CN114491107A - Layered block organization method based on five-layer fifteen-level remote sensing tile data - Google Patents

Info

Publication number
CN114491107A
Authority
CN
China
Prior art keywords
slicing
tile
remote sensing
level
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111571280.8A
Other languages
Chinese (zh)
Inventor
吴森森
余佳鸣
戚劲
杨典华
曾杉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Jixin Kunshan Information Technology Co ltd
Zhejiang University ZJU
Original Assignee
Zhongke Jixin Kunshan Information Technology Co ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Jixin Kunshan Information Technology Co ltd, Zhejiang University ZJU
Priority to CN202111571280.8A
Publication of CN114491107A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

The invention discloses a layered block organization method based on five-layer fifteen-level remote sensing tile data. The method comprises the following steps: 1) performing distributed multi-thread parallel slicing on the original images to generate five-layer fifteen-level remote sensing tile data; 2) for the obtained five-layer fifteen-level remote sensing tile data, coding each tile according to its tile level and its row and column numbers; 3) constructing a two-level index based on the code of each tile, and realizing the hierarchical block organization of the five-layer fifteen-level remote sensing tile data by using a hierarchical block organization model. The method fully combines the segmentation standard and the hierarchical characteristics of five-layer fifteen-level remote sensing tiles, solves the storage problem caused by the massive number of small files generated when the total number of five-layer fifteen-level remote sensing tile data blocks increases rapidly at low levels, and enables the five-layer fifteen-level remote sensing tile data to be efficiently stored and accessed in a distributed file system.

Description

Layered block organization method based on five-layer fifteen-level remote sensing tile data
Technical Field
The invention belongs to the field of remote sensing tile data, and particularly relates to a layered block organization method based on five-layer and fifteen-level remote sensing tile data.
Background
As the level within the five-layer fifteen-level hierarchy decreases, the area covered by each tile shrinks, and the total number of tile data blocks into which the remote sensing image data must be segmented at that level grows exponentially, producing an enormous number of small tile files. Distributed file storage systems are designed mainly for storing large files, and research on optimizing their storage performance for large numbers of small files is limited. To enable GCF data to be stored and accessed efficiently on a distributed file system, the GCF data is organized hierarchically in blocks, following the data read-write principle of traditional storage devices, to form large data set files.
Based on the storage characteristics of distributed file systems, the GCF data hierarchical block organization model solves the problem that massive numbers of small files or data blocks degrade the access efficiency of a distributed file system, and a user can quickly obtain the five-layer fifteen-level remote sensing tile data of a target area through the unique codes and the two-level index of the tile data blocks.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a layered block organization method based on five-layer fifteen-level remote sensing tile data.
In order to realize the purpose of the invention, the technical scheme is as follows:
A hierarchical block organization method based on five-layer fifteen-level remote sensing tile data comprises the following steps:
S1: For all remote sensing images to be sliced, a main node of a distributed system first divides the image slicing tasks and distributes the slicing tasks of different remote sensing images to different nodes of a slicing task cluster; each slicing node independently performs distributed multi-thread parallel slicing on the slicing task of a single remote sensing image, and the five-layer fifteen-level remote sensing tile data are obtained after the slicing tasks of all the remote sensing images have been executed;
The distributed multi-thread parallel slicing is realized based on a hierarchical classification division method and a queuing polling method. First, according to the hierarchical classification division method, the slicing task of the single remote sensing image assigned to the current slicing node is read, all target tile levels to be sliced in the task are classified according to the five-layer fifteen-level hierarchical division standard, and all tile levels belonging to the same tile layer are placed in the same category. The highest tile level contained in each category is then taken out, and a slicing task unit with a first priority is established for each of them; all slicing task units with the first priority execute their slicing tasks in parallel by the queuing polling method. When all slicing task units with the first priority have finished, slicing task units with a second priority are established for the remaining tile levels in each category, and all slicing task units with the second priority execute their slicing tasks in parallel by the queuing polling method, which completes the slicing of all target tile levels in the single remote sensing image. When the queuing polling method is executed, the number of tiles into which the image must be segmented in each slicing task unit is determined, and the slicing tasks of the tiles are numbered and sorted to form a task queue; different threads in a thread pool claim and execute tile slicing tasks in the order of the task queue, and each thread, after finishing the slicing task of one tile, claims the next task from the task queue until all tiles in the slicing task unit have been sliced;
S2: For the five-layer fifteen-level remote sensing tile data obtained in S1, each tile is coded according to its tile level and its row and column numbers;
S3: A two-level index is constructed based on the code of each tile obtained in S2, and the hierarchical block organization of the five-layer fifteen-level remote sensing tile data is realized by using a hierarchical block organization model.
Preferably, the specific implementation steps of S1 are as follows:
S11: After receiving the slicing requests for all remote sensing images to be sliced, the main node creates a slicing task list and submits the slicing tasks in the list to the Kafka cluster;
S13: After receiving the slicing tasks, the Kafka cluster caches them and stores them in different partitions (Partition) according to a balancing strategy, wherein each slicing task corresponds to the slicing of a single remote sensing image;
S14: Each slicing node (Consumer) in the slicing task cluster continuously pulls slicing tasks in sequence from different partitions for execution over a long-lived connection established with Kafka; the partitions allocated to different slicing nodes are different and do not interfere with each other, so different slicing tasks are executed in parallel; each slicing node independently performs distributed multi-thread parallel slicing on the slicing task of a single remote sensing image based on the hierarchical classification division method and the queuing polling method, and the tile data block generated for each tile during slicing is output to an empty tile file and stored in a distributed file system;
S15: After all slicing tasks cached in the Kafka cluster have been completed, the main node is notified that the slicing tasks of all remote sensing images are complete, and the five-layer fifteen-level remote sensing tile data are obtained in the distributed file system.
Preferably, in S14, each slice node in the slice task cluster pulls the slice task from different partitions of the Kafka cluster according to the sequence of message reception.
Preferably, in S14, after each slicing node pulls the slicing task of a single remote sensing image, it first reads the image metadata information and then determines from the projection coordinates in the metadata whether the projection coordinate system of the image is the target coordinate system; if not, projection conversion is performed to produce a remote sensing image in the target coordinate system. The node then loads the remote sensing image data in the target coordinate system, reads the spatial range and the bands of the image and the target tile levels to be segmented, and finally performs the distributed multi-thread parallel slicing based on the hierarchical classification division method and the queuing polling method.
Preferably, the target coordinate system is a WGS84 coordinate system.
Preferably, the image metadata information includes the number of image bands, the size of the image, and the projection coordinates.
Preferably, each slice node in the slice task cluster realizes slicing of the image by calling a GDAL library.
Preferably, the highest tile level is one of the five tile levels F, C, 9, 6, 3.
Preferably, in S2, a 64-bit long integer coding structure is adopted to uniquely code the tile data block corresponding to each tile, so that each tile data block has a 64-bit code value; in the 64-bit long integer coding structure, bits 0 to 31 store the GeoHash code of the tile, bits 32 to 47 store the band number of the tile, and bits 48 to 63 store the tile level number of the tile.
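As an illustration of the 64-bit layout (GeoHash code in bits 0 to 31, band number in bits 32 to 47, tile level number in bits 48 to 63), the following is a minimal Python sketch. It assumes that bit 0 is the least significant bit and that the 32-bit GeoHash value has already been computed elsewhere; the function names `pack_tile_code` and `unpack_tile_code` are hypothetical and do not come from the patent.

```python
# Sketch only: bit 0 is ASSUMED to be the least significant bit.
GEOHASH_BITS = 32   # bits 0..31
BAND_BITS = 16      # bits 32..47
LEVEL_BITS = 16     # bits 48..63


def pack_tile_code(geohash32: int, band: int, level: int) -> int:
    """Pack the three fields into one unsigned 64-bit integer."""
    if not 0 <= geohash32 < (1 << GEOHASH_BITS):
        raise ValueError("geohash32 must fit in 32 bits")
    if not 0 <= band < (1 << BAND_BITS):
        raise ValueError("band must fit in 16 bits")
    if not 0 <= level < (1 << LEVEL_BITS):
        raise ValueError("level must fit in 16 bits")
    return (level << 48) | (band << 32) | geohash32


def unpack_tile_code(code: int) -> tuple[int, int, int]:
    """Return (geohash32, band, level) from a packed 64-bit code."""
    geohash32 = code & 0xFFFFFFFF
    band = (code >> 32) & 0xFFFF
    level = (code >> 48) & 0xFFFF
    return geohash32, band, level
```

Because the tile level occupies the most significant bits under this assumption, sorting the codes numerically groups tiles first by level, then by band, then by GeoHash value.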
Preferably, in S3, the method for realizing the hierarchical block organization of the five-layer and fifteen-level remote sensing tile data by using the hierarchical block organization model is as follows:
Tile data blocks of different tile levels in different remote sensing images are stored hierarchically in a distributed file system as tile data set files, wherein each tile data set file stores all the tile data blocks of one tile level of one remote sensing image. Each tile data set file in the distributed file system is organized hierarchically in blocks through a two-level index, wherein the first-level index, used to look up different tile data set files, is formed by first key-value pairs, and the second-level index, used to look up different tile data blocks within a tile data set file, is formed by second key-value pairs. The key of a first key-value pair is the unique code of the tile data set file, and the value is a pointer to the storage position of the second-level index. The key of a second key-value pair is the 64-bit code value of a tile data block, and the value is a pointer to the third key-value pair of that tile data block. The key of a third key-value pair is the 64-bit code value of a tile data block, and the value is the binary data stream of that tile data block.
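The three kinds of key-value pairs can be pictured with a small in-memory sketch. This is not the patent's file format: the record layout (8-byte code, 4-byte length, payload), the use of byte offsets as "pointers", and the names `TileDatasetFile` and `lookup` are all assumptions made for illustration.

```python
import struct

# ASSUMED record header for a third key-value pair: 64-bit code + payload length.
HEADER = "<QI"
HEADER_SIZE = struct.calcsize(HEADER)


class TileDatasetFile:
    """Stand-in for one tile data set file (one image, one tile level)."""

    def __init__(self) -> None:
        self.blob = bytearray()              # concatenated third key-value pairs
        self.secondary: dict[int, int] = {}  # tile code -> offset into blob

    def append(self, tile_code: int, payload: bytes) -> None:
        """Store one tile data block and register it in the second-level index."""
        offset = len(self.blob)
        self.blob += struct.pack(HEADER, tile_code, len(payload)) + payload
        self.secondary[tile_code] = offset

    def read(self, tile_code: int) -> bytes:
        """Follow the second-level pointer and return the tile's binary data."""
        offset = self.secondary[tile_code]
        stored_code, length = struct.unpack_from(HEADER, self.blob, offset)
        if stored_code != tile_code:
            raise KeyError("index points at a different tile")
        start = offset + HEADER_SIZE
        return bytes(self.blob[start:start + length])


def lookup(primary: dict[str, TileDatasetFile], dataset_key: str, tile_code: int) -> bytes:
    """First-level index selects the data set file, second-level selects the block."""
    return primary[dataset_key].read(tile_code)
```

A lookup therefore touches one entry in the first-level index, one entry in the second-level index, and one contiguous read inside a large file, which is the access pattern a distributed file system handles well.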
Compared with the prior art, the invention has the following beneficial effects:
1) According to the characteristics of five-layer fifteen-level remote sensing tiles, a single-image parallel slicing method based on a queuing polling method and a hierarchical classification division method is designed. The queuing polling method reduces thread idle time and thereby improves the single-level parallel slicing efficiency of a single image on a single machine; the hierarchical classification division method builds on the queuing polling method and uses the hierarchical characteristics of five-layer fifteen-level remote sensing tiles to further improve the multi-level parallel slicing efficiency of a single image.
2) The invention provides a distributed parallel slicing method for a plurality of images based on Kafka, and the distributed parallel slicing algorithm combines the characteristics of a Kafka message middleware on the basis of a single-machine parallel slicing algorithm, so that multi-level parallel slicing of the plurality of images is realized on a distributed slicing cluster. The slicing methods can be well adapted to the characteristics of remote sensing image data and five-layer and fifteen-level remote sensing tile data, and lay a foundation for sharing and applying massive remote sensing image data.
3) The invention designs a 64-bit coding structure and a GridCubeFile structure which can realize the unique coding of a tile data block, thereby realizing the unified and complete data organization and storage of five-layer and fifteen-level remote sensing tile data and facilitating the utilization of the advantages of a distributed storage system on the storage of large files.
In conclusion, the five-layer fifteen-level remote sensing tile data storage method fully combines the segmentation standard and the hierarchical characteristics of the five-layer fifteen-level remote sensing tile, can solve the storage problem caused by the generation of massive small files when the total number of the five-layer fifteen-level remote sensing tile data blocks is rapidly increased in a low-level, and enables the five-layer fifteen-level remote sensing tile data to be efficiently stored and accessed in a distributed file system. The method has very important practical application value for the layered block organization of the five-layer and fifteen-level remote sensing tile data.
Drawings
FIG. 1 is a flow chart of steps of a hierarchical block organization method based on five-layer fifteen-level remote sensing tile data;
FIG. 2 is a schematic diagram illustrating division of a task for parallel slicing of a plurality of images;
FIG. 3 is a flow chart of distributed parallel slicing incorporating Kafka;
FIG. 4 is a schematic diagram of level E tiles and level D tiles;
FIG. 5 is a diagram illustrating slicing task partitioning by hierarchical classification partitioning according to an example;
FIG. 6 is a diagram of queued polling slicing task partitioning in an example;
FIG. 7 is a flow chart of single image slicing in slicing nodes;
FIG. 8 is a schematic diagram of a five-layer fifteen-level remote sensing tile data block encoding structure;
FIG. 9 is a schematic diagram of a five-layer and fifteen-level hierarchical block organization model of remote sensing tile data.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. The technical characteristics in the embodiments of the present invention can be combined correspondingly without mutual conflict.
For convenience of the following description, before discussing the specific technical solutions of the present invention, a description will be given of some basic definitions.
For the specific five-layer fifteen-level hierarchical division standard of remote sensing tile data, reference can be made to the standard definition. At the coarse scale, the remote sensing tile data is divided into five layers of 1 meter, 10 meters, 100 meters, 1000 meters and 10000 meters; at the fine scale, each layer forms 3 levels by multiplying by 2 and dividing by 2 respectively, giving fifteen levels in total. The levels within each layer are in the proportion 0.5:1:2, and the ratio of the last-level tile of the upper layer to the first-level tile of the lower layer is 2.5:1. For convenience of description, the five layers are defined as tile layers, and the 3 levels in each tile layer are defined as tile levels. There are thus 5 tile layers in total (the 1st, 2nd, 3rd, 4th and 5th tile layers, arranged in order of scale from small to large) and 15 tile levels: the 1st tile layer comprises the 1st, 2nd and 3rd tile levels; the 2nd tile layer comprises the 4th, 5th and 6th tile levels; the 3rd tile layer comprises the 7th, 8th and 9th tile levels; the 4th tile layer comprises the A, B and C tile levels; and the 5th tile layer comprises the D, E and F tile levels, in each case arranged in order of scale from small to large. In some other definitions, the tile level numbers A, B, C, D, E and F may equivalently be written as 10, 11, 12, 13, 14 and 15.
In addition, the five-layer fifteen-level specification model is only a division method for the remote sensing range and can in practice be extended at both ends: for example, toward the microscopic direction with layers of 10 mm, 1 mm and 0.1 mm, or toward the macroscopic direction with layers of 10 kilometers, 100 kilometers and 1000 kilometers. The extended model, although it has more than five layers and fifteen levels, is still referred to in the art as five-layer fifteen-level.
In a preferred embodiment of the present invention, as shown in fig. 1, a hierarchical block organization method based on five-layer fifteen-level remote sensing tile data is provided, which is shown in S1-S3 and described in detail below:
S1: For all remote sensing images to be sliced, a main node of a distributed system first divides the image slicing tasks and distributes the slicing tasks of different remote sensing images to different nodes of a slicing task cluster; each slicing node independently performs distributed multi-thread parallel slicing on the slicing task of a single remote sensing image, and the five-layer fifteen-level remote sensing tile data are obtained after the slicing tasks of all the remote sensing images have been executed.
The step S1 is actually a distributed parallel slicing process of a plurality of images, as shown in fig. 2, a schematic diagram of dividing distributed parallel slicing tasks is shown, all remote sensing images to be sliced are respectively divided into different slicing tasks, and then different slicing nodes in a slicing task cluster are used to respectively execute corresponding distributed multi-thread parallel slicing. Each slice node is a task processing server that can be scheduled by the server that is the master node. In the invention, the distributed multithreading parallel slicing executed by each slicing node is realized based on a hierarchical classification and division method and a queuing and polling method, and the specific work flow of the distributed multithreading parallel slicing is described in detail later.
In this embodiment, distributed parallel slicing of multiple remote sensing images can be implemented on the basis of the Kafka message middleware in the distributed system, which improves the parallel slicing efficiency for multiple remote sensing images. Kafka is a distributed message publish-subscribe system. The Kafka streaming platform can publish and subscribe to record streams (similar to a message queue), store record streams in a fault-tolerant, persistent manner, and process record streams as they occur, that is, process the data streams received by the platform in real time. During five-layer fifteen-level tile slicing, the main node records each slicing task as a record and transmits it to the Kafka cluster; the tasks are then distributed to the task processing servers corresponding to the slicing nodes, and each task processing server processes the task data and requests it is responsible for to complete the five-layer fifteen-level tile slicing. The main node corresponds to a Producer in Kafka, and the slicing nodes in the slicing task cluster correspond to Consumers in Kafka. Kafka classifies messages by Topic; according to the partitioning rule, a Topic can be divided into one or several partitions, and the messages in different partitions are independent and non-repeating. The Kafka cluster contains different partitions (Partition); at the physical storage level each Partition can be regarded as a log file, and messages are appended to each log file sequentially in the order in which they are received.
As shown in fig. 3, the distributed parallel slicing process implemented by introducing Kafka in the distributed system includes the following specific steps:
s11: and selecting all the remote sensing images to be sliced, submitting slicing requests of a plurality of images to the main node, creating a slicing task list after the main node receives the slicing requests of all the remote sensing images to be sliced, and submitting slicing tasks in the slicing task list to the Kafka cluster.
S13: after receiving the slicing tasks, the Kafka cluster caches the slicing tasks, and stores the slicing tasks into different partitions according to a balance strategy, wherein each slicing task corresponds to a slice of a single remote sensing image. The slicing tasks stored in the partitions can be subsequently distributed to different slicing nodes according to the receiving sequence of the messages corresponding to the slicing tasks, namely, the slicing task received first is distributed to the slicing nodes first, and then the slicing task received later is distributed to the slicing nodes, so that all the slicing tasks are executed in sequence.
S14: each slicing node in the slicing task cluster continuously pulls the slicing tasks from different partitions in sequence for execution through long connection established with Kafka, the partitions allocated to different slicing nodes are different and do not interfere with each other, and different slicing tasks are executed in parallel. Each slicing node independently performs distributed multi-thread parallel slicing on the slicing task of the single remote sensing image based on a hierarchical classification division method and a queuing polling method (a specific distributed multi-thread parallel slicing process will be described in detail later), and tile data blocks generated by each tile in the slicing process are output to an empty tile file and stored in a distributed file system.
It should be noted that each slicing node in the slicing task cluster may pull the slicing task from different partitions of the Kafka cluster according to the order of message reception.
S15: and after all slicing tasks cached in the Kafka cluster are completed, informing the main node that all slicing tasks of the remote sensing images are completed, and obtaining five-layer fifteen-level remote sensing tile data in the distributed file system.
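The flow S11 to S15 can be pictured with a small pure-Python simulation. This is not Kafka client code and involves no broker: partitions are modelled as in-memory queues, round-robin is only an assumed stand-in for the "balancing strategy", and the names `distribute`, `assign` and `run` are hypothetical.

```python
from collections import deque


def distribute(tasks: list[str], n_partitions: int) -> list[deque]:
    """S13 (simulated): spread tasks over partitions, ASSUMING round-robin balancing."""
    partitions = [deque() for _ in range(n_partitions)]
    for i, task in enumerate(tasks):
        partitions[i % n_partitions].append(task)
    return partitions


def assign(n_partitions: int, n_nodes: int) -> dict[int, list[int]]:
    """Give every slicing node a disjoint set of partitions."""
    return {
        node: [p for p in range(n_partitions) if p % n_nodes == node]
        for node in range(n_nodes)
    }


def run(tasks: list[str], n_partitions: int, n_nodes: int) -> dict[int, list[str]]:
    """S14 (simulated): each node drains its own partitions in arrival order."""
    partitions = distribute(tasks, n_partitions)
    owned = assign(n_partitions, n_nodes)
    done: dict[int, list[str]] = {node: [] for node in range(n_nodes)}
    for node, parts in owned.items():
        for p in parts:
            while partitions[p]:
                # Oldest message first, matching the order of message receipt.
                done[node].append(partitions[p].popleft())
    return done
```

In the simulation the nodes are drained one after another for simplicity; in the described system each slicing node would run on its own server and process its partitions concurrently with the others.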
In each slicing node of the invention, distributed multithreading parallel slicing is executed, and the method is realized based on two improved methods of a hierarchical classification partition method and a queuing polling method. The following first describes specific implementations of these two methods.
In essence, five-layer fifteen-level remote sensing tile slicing cuts images in batches into many small images, according to the hierarchy division standard of five-layer fifteen-level remote sensing tiles, and stores them in a distributed file system. Unlike a traditional quadtree, in which the resolution ratio between adjacent tile levels is 1:2, the mapping between adjacent five-layer fifteen-level tile levels is not a simple 1:2 or 1:4 relationship, and a complete set of tiles cannot always be obtained by simply cutting or splicing the tiles of the adjacent level. As shown in fig. 4, the ratio of tile level E to tile level D is 2.5:1, and the spatial range of 4 tiles of level E is the same as the spatial range of 25 tiles of level D. The tiles of level D cannot all be obtained by directly splitting tiles of level E: those in the middle cross-shaped area (red area) can be obtained only by splicing tiles together and then cutting the result. Likewise, the tiles of level E cannot be obtained directly by splicing tiles of level D; the tiles of level D must be spliced and then cut. Therefore, when a multi-level parallel slicing task is required for a single image, the invention does not start from one tile level and slice downward level by level until all tile levels are complete, but instead proposes a hierarchical classification division method.
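The tile counts quoted for fig. 4 follow directly from the 2.5:1 ratio. As a short check, write $s_D$ and $s_E$ for the tile side lengths of levels D and E (symbols introduced here only for illustration) and assume the two grids share a common origin:

```latex
\[
s_E = 2.5\, s_D
\quad\Longrightarrow\quad
(2 s_E)^2 = (5 s_D)^2 = 25\, s_D^2 ,
\]
```

so a 2 x 2 block of level-E tiles covers the same area as a 5 x 5 block of level-D tiles. Under the shared-origin assumption, the interior level-E boundary lies at $2.5\, s_D$, which falls in the middle of the third row and third column of level-D tiles; those level-D tiles straddle two or four level-E tiles, which is consistent with the cross-shaped area that cannot be produced by splitting a single level-E tile.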
The main idea of the hierarchical classification division method is to first classify the target levels by tile layer and then divide them according to the ratio of spatial extents between levels, so that tiles never need to be spliced and cut. From the standardized grading of five-layer fifteen-level remote sensing tiles, levels F, C, 9, 6 and 3 are the tile levels with the largest tile scale in their respective tile layers; the ratios of levels F, C, 9, 6 and 3 to the other two tile levels in the same layer are 1:2 and 1:5, so the tiles of those two levels can be obtained directly by splitting tiles of the largest-scale level. Therefore, after all levels have been classified by tile layer, the largest tile level in each category is sliced first; if the other two tile levels are also required, they are obtained by splitting the image tiles of the largest tile level directly, without slicing level by level. In this way the image slicing tasks of different tile levels are divided into slicing task units of different priorities, and slicing task units of the same priority can be executed in parallel. Experiments show that, compared with a traditional slicing method without hierarchical classification, the hierarchical classification division method significantly improves the efficiency of multi-level parallel slicing of a single image, and the improvement becomes more pronounced as the number of levels that can be classified and divided increases.
The queuing polling method realizes single-level parallel slicing of a single image within a slicing task unit; it works together with a thread pool and executes in parallel on multiple threads. Task division in the queuing polling method first determines the number of tiles into which the image will be segmented and then the row and column numbers of each tile. The tiles are numbered and queued and are then processed in order by the threads in the thread pool. After a thread finishes the tile slicing task it was allocated, it asks the tile queue for the slicing task of the next tile. Each thread repeats this allocation step until no tiles remain to be sliced, at which point the parallel slicing task for the remote sensing image is complete. Compared with round-robin allocation, which maps tiles to threads one-to-one by row and column number, this allocation reduces thread idle time, because there is no need to wait for all threads to finish one round of slicing tasks before starting the next round. Experiments show that, compared with a single-thread slicing method and an OpenMP-based parallel slicing method, the queuing polling method of the invention effectively improves the efficiency of single-level slicing of a single image.
In this embodiment, the hierarchical classification and division method and the queuing and polling method are used in combination, and the slicing task unit is divided by the hierarchical classification and division method first, and then the slicing task unit is sliced in multiple threads by the queuing and polling method. Therefore, each slicing node executes multi-thread parallel slicing on the pulled slicing task based on the hierarchical classification and division method and the queuing polling method, and the specific adopted work flow is as follows:
firstly, reading a slicing task of a single remote sensing image distributed by a current slicing node according to a hierarchical classification division method, classifying all target tile levels to be segmented in the slicing task according to a five-layer fifteen-level hierarchical division standard, classifying all tile levels belonging to the same tile layer into the same category, then taking out the highest tile level (one of five tile levels of F, C, 9, 6 and 3) contained in all the categories respectively to respectively establish a slicing task unit with a first priority, carrying out the slicing task by all the slicing task units with the first priority in parallel through a queuing polling method, establishing slicing task units with a second priority respectively for the rest tile levels in each category after all the slicing task units with the first priority are finished, carrying out the slicing task by all the slicing task units with the second priority in parallel through the queuing polling method, and finally, slicing all target tile levels in the single remote sensing image. 
As described above, after the slicing task units of the first priority and the slicing task units of the second priority are divided, the slicing task units with the same priority are divided into the slicing tasks by a queuing and polling method, when the queuing and polling method is executed, the number of tiles to be sliced of an image in each slicing task unit (namely the slicing task unit with the first priority or the slicing task unit with the second priority) is determined, then the slicing tasks of the tiles are numbered and ordered according to the row number and the column number of each tile to form a task queue, different threads in a thread pool sequentially claim the tile slicing tasks and execute the tile slicing tasks according to the task ordering in the task queue, and each thread finishes the slicing task of one tile and then sequentially claims new tasks from the task queue until all the tiles in the slicing task unit are sliced.
For example, as shown in fig. 5, suppose a slicing node obtains a slicing task in which the image must be sliced to six levels: 9, 8, 7, 6, 5 and 4. Because the ratios between tile levels are 1:2 and 1:5, the hierarchical classification and division method first classifies the task by tile layer and then divides it by the ratio of each tile level. Levels 9, 8 and 7 belong to layer 3, and levels 6, 5 and 4 belong to layer 2, so the work is divided by layer into two first-priority slicing task units, one slicing level 9 and the other slicing level 6. After the level 9 and level 6 slicing tasks are finished, the tasks are subdivided by tile level. On the basis of the level 9 tiles, two second-priority slicing task units are created: one splits each tile into four parts, at a side-length ratio of 1:2, to produce level 8 tiles; the other splits each tile into twenty-five parts, at a side-length ratio of 1:5, to produce level 7 tiles. Similarly, the level 5 and level 4 slicing tasks are two second-priority slicing task units based on the level 6 tiles. All second-priority units can be executed in parallel, in no particular order. When the tiles of all levels have been generated, the multi-level parallel slicing task for the image is finished.
In the above example every tile layer contains all three of its tile levels, but the invention does not require this. Table 2 lists several other multi-level combinations.
Table 2. Multi-level combination cases
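The division into priority units in this example can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: the names `layer_of` and `plan_units` are hypothetical, levels are written as integers 1 to 15 (hex 1 to F), and the layer formula is extrapolated from the stated facts that levels 9, 8, 7 form layer 3 and levels 6, 5, 4 form layer 2. The sketch also assumes, as in every example in the text, that each layer's largest-scale level (3, 6, 9, C or F) is among the targets.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def layer_of(level: int) -> int:
    """Tile layer (1-5) of a tile level (1-15); assumed three levels per layer."""
    return (level - 1) // 3 + 1


def plan_units(target_levels: List[int]) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (first-priority levels, second-priority (level, base level) pairs)."""
    by_layer: Dict[int, List[int]] = defaultdict(list)
    for level in set(target_levels):
        by_layer[layer_of(level)].append(level)

    first: List[int] = []
    second: List[Tuple[int, int]] = []
    for layer in sorted(by_layer, reverse=True):
        levels = sorted(by_layer[layer], reverse=True)
        top = levels[0]                      # highest level present in the layer
        first.append(top)                    # sliced first, from the source image
        second.extend((lvl, top) for lvl in levels[1:])  # cut from the top level
    return first, second


# The six-level task from the example: levels 9, 8, 7, 6, 5 and 4.
first_units, second_units = plan_units([9, 8, 7, 6, 5, 4])
```

All entries of `first_units` could be dispatched in parallel, and all entries of `second_units` could be dispatched in parallel once the first-priority units have finished.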
Figure BDA0003423800710000081
Each of the multi-level combinations in Table 2 can be divided into task units by the hierarchical classification and division method of the invention. For combination 1, level 9 is sliced first, and levels 8 and 7 are then sliced in parallel on the basis of level 9. For combination 2, levels 9 and 6 are sliced in parallel first, and levels 8 and 7 are then sliced in parallel on the basis of level 9. For combination 3, levels 9 and 6 are sliced in parallel first, and levels 8 and 5 are then sliced in parallel, level 8 on the basis of level 9 and level 5 on the basis of level 6. For combination 4, levels 9 and 6 are sliced in parallel first, and levels 8, 7 and 5 are then sliced in parallel, levels 8 and 7 on the basis of level 9 and level 5 on the basis of level 6.
In the above examples, every slicing task unit, whether first or second priority, executes its slicing task by calling the thread pool according to the queuing polling method. As shown in fig. 6, take one slicing task unit as an example: under the five-layer fifteen-level remote sensing tile grading standard, the image must be cut into 30 tiles at the current level. The tiles are numbered and queued in order to form a task list, and the number of threads in the thread pool is set to 6 according to server performance, so the image is sliced by 6 threads in parallel. In the first round, the six tile areas numbered 0-5 are assigned one each to the 6 threads. In the second round, threads 1, 3, 4 and 6 have finished their current tiles and issued new requests, so the 6 threads are slicing tile areas 6, 1, 7, 8, 4 and 9 respectively. In the third round, threads 2, 4, 5 and 6 have finished their current tiles and issued new requests, so the 6 threads are slicing tile areas 6, 10, 7, 11, 12 and 13 respectively. This continues until the last round, in which threads 1, 3 and 5 slice tile areas 27, 28 and 29 respectively while the slicing of all other areas has already ended. At that point all slicing tasks for the whole image at the current target level are complete.
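The queue-and-claim behaviour in this example can be sketched with Python's standard `queue` and `threading` modules. The function `slice_unit`, the placeholder `fake_cut` and the 5 x 6 grid are assumptions for illustration; in the embodiment the per-tile work would be the actual image cut.

```python
import queue
import threading
import time
from typing import Callable, Dict, List


def slice_unit(rows: int, cols: int, num_threads: int,
               cut_tile: Callable[[int, int], None]) -> Dict[int, List[int]]:
    """Slice rows*cols tiles; each thread claims the next queued tile when free."""
    tasks = queue.Queue()
    for number in range(rows * cols):            # number tiles in row-major order
        tasks.put((number, number // cols, number % cols))

    # Each thread appends only to its own list, so no extra lock is needed.
    done: Dict[int, List[int]] = {t: [] for t in range(num_threads)}

    def worker(thread_id: int) -> None:
        while True:
            try:
                number, row, col = tasks.get_nowait()   # claim the next tile
            except queue.Empty:
                return                                   # nothing left to slice
            cut_tile(row, col)
            done[thread_id].append(number)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


def fake_cut(row: int, col: int) -> None:
    """Placeholder for the real tile cut; sleeps for an uneven amount of time."""
    time.sleep(0.001 * ((row + col) % 3))


# 30 tiles and 6 threads as in the example; the 5 x 6 grid shape is assumed.
result = slice_unit(rows=5, cols=6, num_threads=6, cut_tile=fake_cut)
```

Which thread ends up with which tile varies from run to run, which is the point: a thread that finishes early claims the next tile immediately instead of waiting for a round to end.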
In addition, the foregoing steps describe how the slicing task is partitioned within each slicing node; the slicing operation itself can be implemented by referring to existing five-layer fifteen-level tile slicing methods in the prior art. As shown in fig. 7, the slicing process executed in this embodiment after each slicing node pulls the slicing task of a single remote sensing image is as follows:
First, after a slicing node pulls the slicing task of a single remote sensing image, it reads the image metadata, including the number of bands, the image size (length and width) and the projection coordinates.
Next, it determines from the projection coordinates in the metadata whether the projection coordinate system is WGS84; if not, the image is reprojected to form a remote sensing image in the WGS84 coordinate system.
The remote sensing image data in the WGS84 coordinate system is then loaded, and the spatial extent and bands of the image and the target tile levels to be sliced are read.
Conversion to the WGS84 coordinate system is prior art; one possible conversion algorithm is as follows:
Figure BDA0003423800710000091
Finally, the multi-thread parallel slicing based on the hierarchical classification and division method and the queuing polling method is performed. Specifically, each slicing node in the slicing task cluster can slice the image by calling the GDAL library, and once all slicing tasks are complete, a notification is issued that all image slicing tasks have finished.
S2: For the five-layer fifteen-level remote sensing tile data obtained in S1, encode each tile according to its tile level and its row and column numbers.
In this embodiment, the five-layer fifteen-level remote sensing tile data can be encoded by relying on the uniqueness of the name of the tile data's source data (the original remote sensing image data) and the determinacy of the target level. Specifically, this embodiment uniquely encodes the tile data block of each tile with a 64-bit long-integer coding structure, shown in fig. 8: bits 0-31 store the tile's GeoHash code (hereinafter the "G code"), bits 32-47 store the band number of the tile, and bits 48-63 store the tile level number. Let level be the tile level, band be the band number contained in the tile, and geoHashCode be the G code, i.e. the geospatial position code of the tile data block.
The tile data block code N (the "N code") is computed as follows:
N=((long)geoHashCode)|((long)band)<<32|((long)level)<<48
Based on this coding structure, the GeoHash code, band number and tile level of a tile can be recovered from the N code as follows:
geoHashCode=N&0xFFFFFFFFL
band=0xFFFFL&(N>>32)
level=0xFFFFL&(N>>48)
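As a check on the bit layout (bits 0-31 for the G code, bits 32-47 for the band number, bits 48-63 for the level number), the packing and unpacking can be sketched in Python. The function names `encode_n` and `decode_n` and the range checks are assumptions added for illustration. Note that Python integers are unbounded, so the result here is the unsigned 64-bit value, whereas the Java-style `long` in the formulas above is signed.

```python
MASK32 = 0xFFFFFFFF   # 32 bits for the GeoHash code (bits 0-31)
MASK16 = 0xFFFF       # 16 bits each for band (bits 32-47) and level (bits 48-63)


def encode_n(geohash_code: int, band: int, level: int) -> int:
    """Pack the three fields into one 64-bit N code."""
    if not 0 <= geohash_code <= MASK32:
        raise ValueError("geohash_code must fit in 32 bits")
    if not 0 <= band <= MASK16 or not 0 <= level <= MASK16:
        raise ValueError("band and level must each fit in 16 bits")
    return geohash_code | (band << 32) | (level << 48)


def decode_n(n: int):
    """Recover (geohash_code, band, level) from an N code."""
    return n & MASK32, (n >> 32) & MASK16, (n >> 48) & MASK16


n_code = encode_n(0xDEADBEEF, band=3, level=9)   # hypothetical field values
fields = decode_n(n_code)
```

The 32-bit mask matters on decode: a wider mask would let the low bits of the band number leak into the recovered GeoHash code.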
S3: Construct a two-level index based on the code of each tile obtained in S2, and organize the five-layer fifteen-level remote sensing tile data into layered blocks using a layered block organization model.
In this embodiment, a GridCubeFile structure is designed as the layered block organization model and is used to organize the five-layer fifteen-level remote sensing tile data into layered blocks. The specific method is as follows:
Tile data blocks of different tile levels of different remote sensing images are stored in a distributed file system, layer by layer, as tile data set files. Each tile data set file stores all tile data blocks of one tile level of one remote sensing image, so different tile data set files hold the tile data blocks of different tile levels of different images. The tile data set files in the distributed file system are organized into layered blocks through a two-level index: the first-level index, used to look up tile data set files, consists of first index key-value pairs, and the second-level index, used to look up tile data blocks within a tile data set file, consists of second index key-value pairs. The key of a first index key-value pair is the unique code of the tile data set file, and its value is a pointer to the storage position of the second-level index. The key of a second index key-value pair is the 64-bit code of a tile data block, and its value is a pointer to that tile data block's data block key-value pair. The key of a data block key-value pair is the 64-bit code of the tile data block, and its value is the binary data stream of the tile data block.
For example, as shown in fig. 9, the layered block organization model in this embodiment uniquely encodes the five-layer fifteen-level remote sensing tile data by relying on the uniqueness of the source data name and the determinacy of the target level. A queue of tile data block indexes for the different levels of different images is formed from <IKey, IP> key/value pairs; this is the first-level index, where IKey is the unique code of a tile data set file and IP points to the position of that file's tile data block index. A queue of tile data block indexes for one level of one image is formed from <N, P> key/value pairs; this is the second-level index, where N is the unique code of a tile data block (the N code described above) and P points to the position where that block is stored. A queue of tile data blocks is then formed from <N, D> key/value pairs, where D is the binary stream of a tile data block. The five-layer fifteen-level remote sensing tiles are thus assembled, level by level, into large hierarchical tile data set files, which are convenient to store in a distributed file system. As fig. 9 shows, GridCubeFile stores the five-layer fifteen-level remote sensing tiles hierarchically through the layered block organization model, and all tile data blocks of each level are organized into blocks to form the corresponding tile data set file.
[IKey1, IKey2, …, IKeyn] are the unique codes of the five-layer fifteen-level remote sensing tile data set files (a tile data set file being all tiles of one remote sensing image at one level under the five-layer fifteen-level organization model); each code is determined by the tile level number and the source data name. [N1, N2, …, Nn] are the N codes of the tile data blocks, and [D1, D2, …, Dn] are the binary streams of the tile data blocks corresponding to those N codes; each <Ni, Di> stores one tile data block. In the first-level index, each <IKeyi, IPi> key/value pair stores the unique code of a tile data set together with a pointer IPi to the storage position of its second-level index. In the second-level index, each <Ni, Pi> key/value pair stores the N code of a tile data block together with a pointer Pi to the corresponding key/value pair in the tile data block queue.
The GridCubeFile structure therefore records position information in the index data. When reading the index, GridCubeFile does not load all index data into memory and traverse it; instead, it computes a specified position (targetPosition) from the pointer, reads the key-value pair <Key, Position> directly at that position, and from it finds the position of the record in the data set, which it then accesses directly.
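Positional access of this kind can be sketched with Python's `struct` and `io` modules. The on-disk layout is not given in the text, so everything concrete here is an assumption: fixed 16-byte index entries (an 8-byte key followed by an 8-byte position), an 8-byte header, and the names `INDEX_ENTRY` and `read_index_entry`.

```python
import io
import struct

# Assumed layout: each index entry is 16 bytes, key N then position P, big-endian.
INDEX_ENTRY = struct.Struct(">QQ")


def read_index_entry(index_file, initial_position: int, slot: int):
    """Seek straight to one <Key, Position> entry and decode it."""
    target_position = initial_position + slot * INDEX_ENTRY.size
    index_file.seek(target_position)
    return INDEX_ENTRY.unpack(index_file.read(INDEX_ENTRY.size))


# Hypothetical index: an 8-byte header followed by three entries.
header = b"GCFINDEX"
entries = [(101, 0), (102, 4096), (103, 8192)]
index_file = io.BytesIO(header + b"".join(INDEX_ENTRY.pack(n, p) for n, p in entries))

key, position = read_index_entry(index_file, len(header), 1)
```

Only one 16-byte entry is read per lookup, so the cost does not grow with the size of the index, which is the benefit the paragraph above describes.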
For further ease of understanding, one possible flow for writing five-layer fifteen-level remote sensing tile data into a GridCubeFile is described below:
Step 1: Initialize the GridCubeFile data
Connect to the distributed file server cluster, set the initial position and standard offset of the index file, and create an index file (Index LinerFile) and a tile data block storage file (Data LinerFile).
Step 2: Write the metadata of the GridCubeFile data
Write the metadata into the tile data block storage file as a binary stream, as the first record of that file. The length of the metadata must be written before its binary stream so that the metadata can be read back correctly.
Step 3: Write the index data and remote sensing tile data blocks in a loop
The index key/value pairs <Ni, Pi> and the remote sensing tile data block key/value pairs <Ni, Di> are written into the index file and the tile data block storage file respectively. First, the <Ni, Di> key/value pair is written into the tile data block storage file; then the actual storage position Pi of that pair is written into the index file as a <Ni, Pi> key/value pair. The length of each remote sensing tile data block must be written before its binary stream so that the data block can be read back correctly.
Step 4: Close GridCubeFile reading and writing
After all remote sensing tile data blocks have been written, close the index file and the tile data block storage file and release the resources.
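The four steps can be sketched in Python with in-memory buffers standing in for the files on the distributed file system. The field widths (4-byte length prefix, 8-byte key and position), the function names `write_grid_cube` and `read_tile`, and the sample metadata are all assumptions for illustration; the text fixes only the order of writes and the use of length prefixes.

```python
import io
import struct
from typing import Dict, Tuple

LEN = struct.Struct(">I")      # length prefix (assumed 4 bytes)
KEY = struct.Struct(">Q")      # 64-bit N code
ENTRY = struct.Struct(">QQ")   # index entry <N, P> (assumed layout)


def write_grid_cube(metadata: bytes, tiles: Dict[int, bytes]) -> Tuple[bytes, bytes]:
    # Step 1: create the data block storage file and the index file.
    data_file, index_file = io.BytesIO(), io.BytesIO()
    # Step 2: metadata is the first record, preceded by its length.
    data_file.write(LEN.pack(len(metadata)) + metadata)
    # Step 3: write each <N, D> pair, then record its position as <N, P>.
    for n, block in tiles.items():
        position = data_file.tell()
        data_file.write(KEY.pack(n) + LEN.pack(len(block)) + block)
        index_file.write(ENTRY.pack(n, position))
    # Step 4: real files would be closed here; buffers are simply returned.
    return data_file.getvalue(), index_file.getvalue()


def read_tile(data: bytes, index: bytes, n: int) -> bytes:
    """Read one tile back; the linear index scan is only for brevity."""
    for key, position in ENTRY.iter_unpack(index):
        if key == n:
            (length,) = LEN.unpack_from(data, position + KEY.size)
            start = position + KEY.size + LEN.size
            return data[start:start + length]
    raise KeyError(n)


data, index = write_grid_cube(b'{"bands": 3}', {7: b"abc", 9: b"hello"})
tile = read_tile(data, index, 9)
```

Writing the length before each binary stream is what lets the reader know how many bytes to take from the stored position without scanning for a delimiter.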
Additionally, the above index and hierarchical tile data sets may ultimately be stored in a distributed file system. In the storage process, the hierarchical tile data set can be distributed and stored in the distributed storage nodes after being split according to the data blocks. If the storage resources need to be further saved, a certain compression strategy can be adopted.
The above-described embodiments are merely preferred embodiments of the present invention, which should not be construed as limiting the invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, the technical scheme obtained by adopting the mode of equivalent replacement or equivalent transformation is within the protection scope of the invention.

Claims (10)

1. A layered block organization method based on five-layer fifteen-level remote sensing tile data is characterized by comprising the following steps:
S1: for all remote sensing images to be sliced, a main node of a distributed system first divides the image slicing tasks of the remote sensing images and distributes the slicing tasks of different remote sensing images to different nodes of a slicing task cluster, each slicing node independently carries out distributed multi-thread parallel slicing on the slicing task of a single remote sensing image, and five-layer fifteen-level remote sensing tile data is obtained after the slicing tasks of all the remote sensing images have been executed;
the distributed multithreading parallel slicing is realized based on a hierarchical classification division method and a queuing polling method, firstly, a slicing task of a single remote sensing image distributed by a current slicing node is read according to the hierarchical classification division method, all target tiles to be sliced in the slicing task are classified according to a five-layer fifteen-level hierarchical division standard, all the tiles belonging to the same tile layer are classified into the same category, then the highest tile level contained in all the categories is taken out to respectively establish a slicing task unit with a first priority, all the slicing task units with the first priority are subjected to slicing tasks in parallel by the queuing polling method, when all the slicing task units with the first priority finish the slicing tasks, the rest tiles in each category are respectively established into slicing task units with a second priority, all the slicing task units with the second priority are subjected to slicing tasks in parallel by the queuing polling method, finally, slicing of all target tile levels in the single remote sensing image is completed; when the queuing polling method is executed, the number of tiles of which the images need to be segmented in each slicing task unit is determined, then the slicing tasks of the tiles are numbered and sequenced to form a task queue, different threads in a thread pool sequentially claim the tile slicing tasks and execute the tile slicing tasks according to the task sequencing in the task queue, and after each thread finishes the slicing task of one tile, new tasks are sequentially claimed from the task queue until all the tiles in the slicing task unit are sliced;
S2: for the five-layer fifteen-level remote sensing tile data obtained in S1, encoding each tile according to the tile level and the row and column numbers of the tile;
S3: constructing a two-level index based on the code of each tile obtained in S2, and realizing the layered block organization of the five-layer fifteen-level remote sensing tile data by using a layered block organization model.
2. The hierarchical block organization method according to claim 1, characterized in that: the specific implementation steps of the S1 are as follows:
S11: after receiving slicing requests for all remote sensing images to be sliced, the main node creates a slicing task list and submits the slicing tasks in the list to the Kafka cluster;
S13: after receiving the slicing tasks, the Kafka cluster caches them and stores them into different partitions according to a balancing strategy, each slicing task corresponding to the slicing of a single remote sensing image;
S14: each slicing node (Consumer) in the slicing task cluster continuously pulls slicing tasks in sequence from different partitions, through a long-lived connection established with Kafka, and executes them; the partitions allocated to different slicing nodes are different and do not interfere with each other, and different slicing tasks are executed in parallel; each slicing node independently carries out distributed multi-thread parallel slicing on the slicing task of a single remote sensing image based on the hierarchical classification division method and the queuing polling method, and the tile data block generated for each tile during slicing is output to an empty tile file and stored in a distributed file system;
S15: after all slicing tasks cached in the Kafka cluster are completed, the main node is notified that the slicing tasks of all remote sensing images are complete, and the five-layer fifteen-level remote sensing tile data is obtained in the distributed file system.
3. The hierarchical block organization method according to claim 2, characterized in that: in S14, each slice node in the slice task cluster pulls the slice task from different partitions of the Kafka cluster according to the sequence of message reception.
4. The hierarchical block organization method according to claim 2, characterized in that: in the step S14, after each slice node draws a slice task of a single remote sensing image, image metadata information is read first, then it is determined whether a projection coordinate system of the single remote sensing image is a target coordinate system according to projection coordinates in the metadata information, if not, the single remote sensing image is subjected to projection conversion to form a remote sensing image in the target coordinate system, then remote sensing image data in the target coordinate system is loaded, a spatial range, a waveband, and a target tile level to be segmented of the image are read, and finally the distributed multithreading parallel slicing is performed based on a hierarchical classification partitioning method and a queuing polling method.
5. The hierarchical block organization method according to claim 4, characterized in that: the target coordinate system is a WGS84 coordinate system.
6. The hierarchical block organization method according to claim 4, characterized in that: the image metadata information comprises the number of image bands, the size of the image and the projection coordinates.
7. The hierarchical block organization method according to claim 3, characterized in that: and each slice node in the slice task cluster realizes the slicing of the image by calling the GDAL library.
8. The hierarchical block organization method according to claim 1, characterized in that: the highest tile level is one of the five tile levels F, C, 9, 6, 3.
9. The hierarchical block organization method according to claim 1, characterized in that: in S2, the tile data block corresponding to each tile is uniquely encoded by using a 64-bit long-integer coding structure, each tile data block forming a 64-bit code value; in the 64-bit long-integer coding structure, bits 0 to 31 store the GeoHash code of the tile, bits 32 to 47 store the band number of the tile, and bits 48 to 63 store the tile level number.
10. The hierarchical block organization method according to claim 1, characterized in that: in the S3, the method for realizing the hierarchical block organization of the five-layer and fifteen-level remote sensing tile data by using the hierarchical block organization model is as follows:
storing tile data blocks of different tile levels of different remote sensing images in a distributed file system, layer by layer, as tile data set files, wherein each tile data set file stores all tile data blocks of one tile level of one remote sensing image; the tile data set files in the distributed file system are organized into layered blocks through a two-level index, wherein a first-level index used for looking up different tile data set files is formed by first index key-value pairs, and a second-level index used for looking up different tile data blocks within a tile data set file is formed by second index key-value pairs; the key of a first index key-value pair is the unique code of the tile data set file, and the value is a pointer to the storage position of the second-level index; the key of a second index key-value pair is the 64-bit code value of a tile data block, and the value is a pointer to the data block key-value pair of that tile data block; the key of a data block key-value pair is the 64-bit code value of a tile data block, and the value is the binary data stream of the tile data block.
CN202111571280.8A 2021-12-21 2021-12-21 Layered block organization method based on five-layer fifteen-level remote sensing tile data Withdrawn CN114491107A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111571280.8A CN114491107A (en) 2021-12-21 2021-12-21 Layered block organization method based on five-layer fifteen-level remote sensing tile data

Publications (1)

Publication Number Publication Date
CN114491107A true CN114491107A (en) 2022-05-13

Family

ID=81493785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111571280.8A Withdrawn CN114491107A (en) 2021-12-21 2021-12-21 Layered block organization method based on five-layer fifteen-level remote sensing tile data

Country Status (1)

Country Link
CN (1) CN114491107A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100541A (en) * 2022-07-21 2022-09-23 文娟 Satellite remote sensing data processing method and system and cloud platform
CN116910290A (en) * 2023-09-12 2023-10-20 航天宏图信息技术股份有限公司 Method, device, equipment and medium for loading slice-free remote sensing image
CN116910290B (en) * 2023-09-12 2023-12-26 航天宏图信息技术股份有限公司 Method, device, equipment and medium for loading slice-free remote sensing image
CN117221336A (en) * 2023-11-08 2023-12-12 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) Remote sensing image release method and system
CN117221336B (en) * 2023-11-08 2024-01-30 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) Remote sensing image release method and system

Similar Documents

Publication Publication Date Title
CN114491107A (en) Layered block organization method based on five-layer fifteen-level remote sensing tile data
CN109992636B (en) Space-time coding method, space-time index and query method and device
CN107423368B (en) Spatio-temporal data indexing method in non-relational database
CN102521386B (en) Method for grouping space metadata based on cluster storage
CN111291016B (en) Hierarchical hybrid storage and indexing method for massive remote sensing image data
CN103246749B (en) The matrix database system and its querying method that Based on Distributed calculates
CN112398899A (en) Software micro-service combination optimization method for edge cloud system
CN105700948A (en) Method and device for scheduling calculation task in cluster
CN103077197A (en) Data storing method and device
CN111125392A (en) Remote sensing image storage and query method based on matrix object storage mechanism
JP2017509043A (en) Graph data query method and apparatus
CN116302461A (en) Deep learning memory allocation optimization method and system
CN108140022B (en) Data query method and database system
CN105550180B (en) The method, apparatus and system of data processing
CN115994197A (en) GeoSOT grid data calculation method
CN111414961A (en) Task parallel-based fine-grained distributed deep forest training method
CN116708446B (en) Network performance comprehensive weight decision-based computing network scheduling service method and system
US8769550B1 (en) Reply queue management
CN116737370A (en) Multi-resource scheduling method, system, storage medium and terminal
CN114338718B (en) Distributed storage method, device and medium for massive remote sensing data
WO2021057824A1 (en) Method and apparatus for querying data, computing device, and storage medium
CN114035919A (en) Task scheduling system and method based on power distribution network layered distribution characteristics
WO2021123751A1 (en) Discrete optimisation
CN112632118A (en) Method, device, computing equipment and storage medium for querying data
Bai et al. Skyline-join query processing in distributed databases

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20220513
