CN112395453A - Self-adaptive distributed remote sensing image caching and retrieval method - Google Patents

Publication number
CN112395453A
Authority
CN
China
Prior art keywords
remote sensing
sensing image
image
distributed
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011341421.2A
Other languages
Chinese (zh)
Other versions
CN112395453B (en)
Inventor
黄伟
张东映
唐振超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202011341421.2A
Publication of CN112395453A
Application granted
Publication of CN112395453B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0895 Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1024 Latency reduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/16 General purpose computing application
    • G06F2212/163 Server or database system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/26 Using a specific storage system architecture
    • G06F2212/263 Network storage, e.g. SAN or NAS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/45 Caching of specific data in cache memory
    • G06F2212/455 Image or video data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a self-adaptive distributed remote sensing image caching and retrieval method, which belongs to the field of spatial data information services. The method stores frequently accessed remote sensing images in a distributed cache, together with the form information corresponding to each image, the time at which each space vector query was received, the number of requests, and related information. A storage manager is added to the matrix abstraction layer on each host that runs the matrix-oriented object storage system; the storage manager sends frequently accessed remote sensing images to the other hosts, so that these images are replicated in the distributed cache and the system responds faster to space vector query requests when multiple clients access it simultaneously. By merging the query requests for image blocks, the invention maps image access requests from the image access interface layer onto the distributed cache, thereby reducing both the number of accesses to the object storage system and the query waiting time of the image access interface layer.

Description

Self-adaptive distributed remote sensing image caching and retrieval method
Technical Field
The invention belongs to the field of spatial data information service, and particularly relates to a self-adaptive distributed remote sensing image caching and retrieval method.
Background
With the wide application of satellite remote sensing data, particularly in the news media, society pays close attention to changes in population and the ecological environment in local hot spot areas, and it is increasingly important to query satellite remote sensing image data quickly and to provide preliminary analysis results. Groups and individuals who use satellite remote sensing images for public welfare activities often focus on a local geospatial region and access its satellite images periodically and at high frequency. There is therefore a strong practical need for a caching and query optimization method for satellite images of hot spot areas.
Current storage facilities for remote sensing images and their blocks include distributed file systems and matrix-oriented object storage systems. First, a distributed file system, such as the classic HDFS, provides a high-throughput data blocking service: it cuts an image into blocks, replicates the blocks across the disk file systems of multiple hosts, and builds a metadata index table of the blocks, which improves the throughput rate for remote sensing images (the throughput rate is the number of bytes an application can read per unit time, usually measured in MB/s). Second, a matrix-oriented object storage system comprises an image access interface layer, a matrix abstraction layer, an object storage system, and other components. The image access interface layer is the entry point of the matrix-oriented object storage system: on receiving a space vector query request, it converts the request into matrix block information that the matrix abstraction layer can understand, and the matrix abstraction layer then converts the matrix block information into the addresses of objects stored in the object storage system. Once the corresponding image is located, the object storage system reads it and returns it to the matrix abstraction layer, which returns it to the image access interface layer. The other components of the matrix-oriented object storage system, such as a key-value storage system, mainly store the metadata of the remote sensing images.
A distributed file system has no concept of image hot spots, so it offers no fast access to image blocks with a high access frequency. A matrix-oriented object storage system, by contrast, does have this concept: by storing image access frequency information in the key-value storage system, it can tell which area's remote sensing images were accessed at high frequency within a given period. For brevity, the matrix-oriented object storage system is referred to below as the matrix object storage system.
However, no matrix object storage system currently provides a fast query service for image blocks of hot spot areas. The main reasons are as follows. The distributed object storage database system cannot accurately track how images of hot spot areas are used, and every object access incurs a certain cost in speed and time (limited mainly by the access bandwidth between the client and the cloud host). To speed up access to hot spot images, a cache region must be established in the matrix abstraction layer of the matrix object storage system to reduce the number of accesses to the object storage system. In addition, if too many query requests are forwarded to one host, the workload of that host increases.
Therefore, establishing the hot spot area image cache suitable for the new storage architecture is the key for improving the image access speed of the matrix-oriented object storage system.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the invention provides a self-adaptive distributed remote sensing image caching and retrieval method. It aims to manage the cache with a distributed cache so that the workload of the hosts tends toward balance, and to speed up the response of the matrix-oriented object storage system to users' remote sensing image query requests.
In order to achieve the above object, the present invention provides a self-adaptive distributed remote sensing image caching method, including:
S1. selecting, from the remote sensing images stored in a distributed object storage database, the remote sensing images to be cached in a distributed cache according to the sending time, the geospatial range and the time range of the space vector query requests;
S2. storing the selected remote sensing images in the distributed cache, the distributed cache being a cache system composed of the memory currently available on a plurality of hosts.
Further, step S1 specifically comprises:
for remote sensing image query requests sent at the same moment with the same geospatial range and the same time range, when the number of requests is greater than a first threshold N, obtaining the remote sensing image from the distributed object storage database and caching it;
for remote sensing image query requests sent at different moments with the same geospatial range and the same time range, when the number of requests within a specified time period is greater than the first threshold N, obtaining the remote sensing image from the distributed object storage database and caching it;
for remote sensing image query requests sent at different moments with the same geospatial range and different time ranges, caching the remote sensing images whose query frequency exceeds a second threshold M; when two or more geospatial vectors have the same query frequency, deciding which remote sensing images to cache according to the containment relation between the space vectors: when space vector A contains space vector B, caching the remote sensing image corresponding to space vector A; when neither space vector contains the other, caching the remote sensing images corresponding to both A and B.
Further, the query frequency of a space vector is obtained by dividing the number of visits by a set time interval.
Further, when the number of images managed by the distributed cache exceeds a preset upper limit, or when the response delay of image query requests becomes long, a clearing mechanism of the distributed cache is triggered, and the remote sensing images least frequently used within a set time are deleted to make room for new images with a high access frequency.
The invention also provides a self-adaptive distributed remote sensing image retrieval method, which comprises the following steps:
S1. the image access interface layer selects the storage manager least used within a set time period to process a space vector query request;
S2. the storage manager checks whether the remote sensing image corresponding to the space vector query request exists in the memory of its host; each host has one storage manager, which is responsible for storing remote sensing images into the distributed cache system and for retrieval tasks;
01. if the remote sensing image exists, the storage manager returns it directly;
02. if the remote sensing image does not exist, the storage manager sends the space vector query request to the distributed cache; the distributed cache forwards the query request to the distributed object storage database, retrieves the remote sensing image from it, returns the result, and notifies the current host that the result has been returned;
03. if the distributed cache does not return a result within the specified time, the storage manager running on the host sends the query request directly to the distributed object storage database and stores the remote sensing image in the distributed cache while returning it.
Further, before the query request is sent to the distributed object storage database, the method further includes:
merging a plurality of space vector query requests covered by the same remote sensing image to form a merged query sub-region set.
Further, the method further comprises:
judging whether the queried sub-region is in the merged query sub-region set, and returning False if it is not; if it is, calculating the offset position of the sub-region in the one-dimensional array corresponding to the Cartesian product from the sum of the differences between the start column and the end column of each sub-region and the number of units in each sub-region, and thereby obtaining the data of each sub-region.
In general, the above technical solutions contemplated by the present invention can achieve the following advantageous effects compared to the prior art.
(1) The method stores frequently accessed remote sensing images in a distributed cache, together with the form information corresponding to each image, the time at which each space vector query was received, the number of requests, and related information. A storage manager is added to the matrix abstraction layer on each host that runs the matrix-oriented object storage system, and the storage manager sends frequently accessed remote sensing images to the other hosts, so that these images are replicated in the distributed cache and the system responds faster to space vector query requests when multiple clients access it simultaneously.
(2) By merging the query requests for image blocks, the storage manager maps image access requests from the image access interface layer onto the distributed cache, which reduces both the number of accesses to the object storage system and the query waiting time of the image access interface layer.
(3) The clearing mechanism of the distributed cache is triggered when the number of managed images exceeds the preset upper limit or when the response delay of image block query requests grows. When the image access interface layer forwards a space vector query request, it forwards the request to the hosts that respond fastest, which reduces the system response time.
Drawings
FIG. 1 is a diagram of a matrix object storage system architecture with the addition of a distributed cache system;
FIG. 2 is a flow chart of automated clearing of remote sensing image patches;
FIG. 3 is a flow chart of writing a remote sensing image in a hot spot area into a distributed cache in blocks;
FIG. 4 is a schematic diagram of a simulation query of four image partitions in a distributed cache;
FIG. 5 is a flowchart of an algorithm for obtaining data for each sub-region from the merged query result.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in fig. 1, after the image access interface layer maps and converts a space vector query request, the matrix abstraction layer parses the geospatial description in the request and queries the remote sensing images that cover that geospatial range. Before the distributed cache is added, the storage manager in the matrix abstraction layer is a module running on each host; the modules receive space vector query requests with equal probability, the matrix abstraction layer converts each request into the format required by the interface layer of the object storage system, and the remote sensing image is retrieved through that interface layer. After the distributed cache is added, the space vector query request is first looked up in the distributed cache. When both the space vector and the time range constraint are satisfied, all images are retrieved from the distributed cache system; when the time range is only partially satisfied, the satisfied part is fetched from the distributed cache and the remaining part is still accessed through the interface layer of the object storage system, as before.
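The partial-hit case described above can be sketched as follows. This is only an illustration under stated assumptions: time ranges are modelled as inclusive integer intervals (for example years or day numbers), and the function name `split_time_range` is hypothetical, not taken from the patent.

```python
def split_time_range(requested, cached):
    """Split a requested time range into the cached part and the uncached rest.

    Both arguments are inclusive (start, end) integer intervals (an assumption).
    Returns (hit, misses): `hit` is the overlap served from the distributed
    cache (or None), `misses` are the parts still fetched from the object store.
    """
    r0, r1 = requested
    c0, c1 = cached
    lo, hi = max(r0, c0), min(r1, c1)
    if lo > hi:
        # No overlap: the whole request goes to the object storage interface layer.
        return None, [requested]
    misses = []
    if r0 < lo:
        misses.append((r0, lo - 1))
    if hi < r1:
        misses.append((hi + 1, r1))
    return (lo, hi), misses
```

For a request covering 2018 to 2020 against a cache holding only 2019, the sketch serves 2019 from the cache and leaves 2018 and 2020 for the object storage system.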
The storage space of the distributed cache system provided by the invention consists of the memory available to the storage managers running on a plurality of hosts. Because storage is coordinated among the hosts, the distributed cache system appears as one huge pool of storage resources. The key feature of the caching method is that once image blocks have been cached in a distributed manner, the query result is returned quickly the next time the same space vector query request arrives. Specifically, a Redis distributed cache system is adopted, with the metadata of a remote sensing image as the key and the entity data of the remote sensing image as the value. Within the distributed system, the hosts store and update information such as the access frequency of the remote sensing images, the number of hosts currently online, and the IP addresses and port numbers of peer hosts through the gossip protocol.
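A minimal sketch of the key-value layout, assuming the metadata can be serialized to a deterministic string. The metadata fields (`satellite`, `bbox`, `time_range`) and the `InMemoryKV` class are hypothetical; the class is a dict-backed stand-in so the example runs without a Redis server.

```python
import json


def make_cache_key(metadata: dict) -> str:
    """Serialize image metadata deterministically so equal queries share one key."""
    return json.dumps(metadata, sort_keys=True, separators=(",", ":"))


class InMemoryKV:
    """Dict-backed stand-in for a key-value cache (not a real Redis client)."""

    def __init__(self):
        self._data = {}

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str):
        return self._data.get(key)


# Hypothetical metadata fields, for illustration only.
meta = {"satellite": "Sentinel2", "bbox": [2, 3, 1, 2], "time_range": "2018-2019"}
kv = InMemoryKV()
kv.set(make_cache_key(meta), b"image-bytes")
```

Sorting the keys during serialization means that two requests carrying the same metadata in a different field order still resolve to the same cache entry.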
The invention provides a self-adaptive distributed remote sensing image caching method, which comprises the following steps:
S1. selecting, from the remote sensing images stored in a distributed object storage database, the remote sensing images to be cached in a distributed cache according to the sending time, the geospatial range and the time range of the space vector query requests;
Specifically, in step S1: for remote sensing image query requests sent at the same moment with the same geospatial range and the same time range, when the number of requests is greater than the first threshold N, the remote sensing image is obtained from the distributed object storage database and cached; for remote sensing image query requests sent at different moments with the same geospatial range and the same time range, when the number of requests within a specified time period is greater than the first threshold N, the remote sensing image is obtained from the distributed object storage database and cached; for remote sensing image query requests sent at different moments with the same geospatial range and different time ranges, the remote sensing images whose query frequency exceeds a second threshold M are cached. When two or more geospatial vectors have the same query frequency, the images to cache are decided by the containment relation between the space vectors: when space vector A contains space vector B, the remote sensing image corresponding to space vector A is cached; when neither space vector contains the other, the remote sensing images corresponding to both A and B are cached. The query frequency of a space vector is obtained by dividing the number of visits by a set time interval, for example a week or a month.
S2. storing the selected remote sensing images in the distributed cache; the distributed cache is a cache system composed of the memory currently available on a plurality of hosts.
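The selection rules of step S1 can be sketched as below. This is a hedged illustration, not the patented implementation: a space vector is simplified to an axis-aligned bounding box, and the names `Box`, `select_hot_keys` and `resolve_equal_frequency` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box standing in for a space vector."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, other: "Box") -> bool:
        return (self.min_x <= other.min_x and self.min_y <= other.min_y
                and self.max_x >= other.max_x and self.max_y >= other.max_y)


def query_frequency(visits: int, interval: float) -> float:
    """Query frequency = number of visits / set time interval."""
    return visits / interval


def select_hot_keys(requests, n_threshold: int) -> set:
    """Cache every (space range, time range) pair requested more than N times.

    `requests` is an iterable of (box, time_range) tuples seen in the period.
    """
    counts = Counter(requests)
    return {key for key, count in counts.items() if count > n_threshold}


def resolve_equal_frequency(a: Box, b: Box) -> list:
    """Tie-break for equal query frequency: keep only the containing box,
    or both boxes when neither contains the other."""
    if a.contains(b):
        return [a]
    if b.contains(a):
        return [b]
    return [a, b]
```

The tie-break is written symmetrically (it also handles B containing A), which is a small generalization of the rule as stated in the text.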
If the distributed cache has no spare space, the least frequently used remote sensing images are deleted to make room for new images with a high access frequency. Suppose the access frequency of a remote sensing image decreases according to the exponential function $(1/e)^{x}$; then the formula $f_{t+1} = (1/e)\,f_{t}$ approximates the current number of accesses at time $t+1$. When the number of accesses has decayed to nearly 0, the remote sensing image is cleared, as shown in fig. 2; in practice this can be implemented with a background daemon process and a timer. The conditions that trigger the clearing event also include, among others, cache clearing required by operation and maintenance management.
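A minimal sketch of the timer-driven clearing step, assuming one reading of the decay formula in which each tick multiplies the access count by 1/e. The function names and the `epsilon` cut-off for "close to 0" are hypothetical choices, not values from the patent.

```python
import math


def decay_step(access_count: float) -> float:
    """One timer tick: multiply the count by 1/e (assumed reading of the formula)."""
    return access_count / math.e


def sweep(cache: dict, counts: dict, epsilon: float = 0.01) -> list:
    """Decay every counter once and evict images whose counter is close to 0.

    `cache` maps image key -> image bytes; `counts` maps image key -> access count.
    Returns the list of evicted keys.
    """
    evicted = []
    for key in list(counts):
        counts[key] = decay_step(counts[key])
        if counts[key] < epsilon:
            cache.pop(key, None)
            del counts[key]
            evicted.append(key)
    return evicted
```

In a deployment, a background daemon would call `sweep` on each timer tick, and each real access would increment the corresponding counter between ticks.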
The invention also provides a self-adaptive distributed remote sensing image retrieval method, which comprises the following steps:
S1. the image access interface layer selects the storage manager least used within a set time period to process a space vector query request;
The storage manager is a process running on each host. Hosts differ in performance: for example, the number of CPUs, the memory size, and the bandwidth available for accessing the distributed object store may vary from host to host. If the storage manager on a poorly performing host receives most of the space vector query requests while the storage managers on well performing hosts sit waiting, the overall query load is unbalanced. The image access interface layer designed by the invention therefore records, for each space vector query request, the sending time, the response time, and the IP address and port number of the responding host, and selects the least used storage manager to process the request according to the least recently used principle.
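One way to sketch the selection of the least used storage manager is shown below. It assumes that "least used" means the fewest dispatches inside a sliding time window, with ties broken in favour of the manager that was used longest ago; the class `ManagerSelector` and its parameters are hypothetical.

```python
from collections import deque


class ManagerSelector:
    """Pick the storage manager with the fewest dispatches in a sliding window."""

    def __init__(self, managers, window_seconds: float):
        self.window = window_seconds
        # (ip, port) -> timestamps of recent dispatches, oldest first
        self.log = {manager: deque() for manager in managers}

    def pick(self, now: float):
        # Drop dispatch records that fell out of the window.
        for times in self.log.values():
            while times and now - times[0] > self.window:
                times.popleft()

        def load(manager):
            times = self.log[manager]
            last_used = times[-1] if times else float("-inf")
            return (len(times), last_used)

        chosen = min(self.log, key=load)
        self.log[chosen].append(now)
        return chosen
```

A fuller version could also weight each manager by its recorded response time, which the interface layer is described as logging.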
S2. the storage manager checks whether the remote sensing image corresponding to the space vector query request exists in the memory of its host; each host has one storage manager, which is responsible for storing remote sensing images into the distributed cache system and for retrieval tasks;
As shown in fig. 3, if the requested remote sensing image exists on the host, the storage manager of that host returns it directly. If it does not exist, the storage manager sends the space vector query request to the distributed cache; the distributed cache forwards the query request to the distributed object storage database, retrieves the remote sensing image from it, returns the result, and notifies the current host that the result has been returned. If the distributed cache does not return a result within the specified time, the storage manager running on the host sends the query request directly to the distributed object storage database, returns the remote sensing image, and stores it in the distributed cache.
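The three branches of the retrieval flow can be sketched as follows. This is an illustration under assumptions: the distributed cache and the object store are passed in as plain callables, a cache timeout is modelled by the callable raising `TimeoutError`, and the function name `retrieve` is hypothetical.

```python
def retrieve(key, local_memory, cache_fetch, object_store_fetch, cache_put):
    """Return (image, source) following the local / cache / object-store order.

    local_memory       : dict held by the storage manager on this host
    cache_fetch        : callable(key) -> image; raises TimeoutError when the
                         distributed cache does not answer in time (assumption)
    object_store_fetch : callable(key) -> image from the object storage database
    cache_put          : callable(key, image) storing the image in the cache
    """
    # Branch 1: the image is already in this host's memory.
    if key in local_memory:
        return local_memory[key], "local"
    try:
        # Branch 2: ask the distributed cache, which itself falls through
        # to the object storage database on a miss.
        return cache_fetch(key), "distributed-cache"
    except TimeoutError:
        # Branch 3: the cache timed out, so query the object store directly
        # and populate the cache while returning the image.
        image = object_store_fetch(key)
        cache_put(key, image)
        return image, "object-store"
```

Passing the backends as callables keeps the sketch independent of any particular cache client or storage API.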
Before query requests are sent to the distributed object storage database, the query requests for multiple space vectors covered by the same remote sensing image are merged, so as to minimize the number of external storage accesses.
A remote sensing image stored in the object storage system, that is, a remote sensing image object retrieved from the object storage system, can be very large: the swath width of the domestic GF-3 satellite, for example, can reach 650 km. From the perspective of the matrix abstraction layer, such an object is a giant multidimensional matrix. In practical satellite remote sensing applications, however, only a few or a few dozen small local regions of the matrix are needed. For example, the image of the Wuhan area is a matrix object of large length and width in the object storage system. Suppose an application that monitors the water quality of East Lake, South Lake, Tangxun Lake and other lakes periodically sends space vector query requests for the corresponding areas through a Web browser; after mapping and conversion by the image access interface layer, all of these requests turn out to need the same image. If the region query requests are not merged, the matrix abstraction layer separately requests the image block of the East Lake region, the image block of the South Lake region and the image block of the Tangxun Lake region, which establishes and opens three communication connections with the object storage system. Assuming that the overhead of each communication connection is T, the overhead of satisfying the three space vector queries is 3T. To avoid establishing communication with the distributed object storage database every time a specific sub-region of a remote sensing image is queried, the method uses merged region queries: space vector queries for different regions of the same image are merged into a single query, which reduces the number of accesses to the distributed object storage database. As shown in fig. 4, suppose the four sub-regions of the two-dimensional matrix to be accessed are S1 = {2,3,1,2}, S2 = {2,3,5,6}, S3 = {5,5,1,2} and S4 = {5,5,5,6}, corresponding to the black-framed portions in the figure. Assuming that the time overhead of establishing a connection with the remote device is T and the time for transmitting the data is a constant C, querying the four regions without merging takes 4(T + C), whereas the merged query takes T + C, so the rate is increased by 3 times. For a remote sensing image covering a large spatial area, the corresponding matrix is extremely large and the number of sub-region fetches from the image grows markedly, so merging improves the efficiency of the query service all the more.
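The merging step and its cost model can be sketched as follows. The sketch assumes that each sub-region is written as {row start, row end, column start, column end}, which is inferred from the examples in the text rather than stated explicitly; the function names are hypothetical.

```python
def merge_subregions(subregions):
    """Collect the distinct row ranges and column ranges of sub-regions
    that fall on the same image, so one merged query can cover them all."""
    rows = sorted({(r0, r1) for r0, r1, _, _ in subregions})
    cols = sorted({(c0, c1) for _, _, c0, c1 in subregions})
    return rows, cols


def unmerged_cost(n_queries: int, connect: float, transfer: float) -> float:
    """Each separate query pays the connection overhead T and the transfer time C."""
    return n_queries * (connect + transfer)


def merged_cost(connect: float, transfer: float) -> float:
    """One merged query pays T + C once (the text treats C as a constant)."""
    return connect + transfer


# The four sub-regions S1..S4 from the fig. 4 example.
regions = [(2, 3, 1, 2), (2, 3, 5, 6), (5, 5, 1, 2), (5, 5, 5, 6)]
rows, cols = merge_subregions(regions)
```

For the fig. 4 example, the merged query reduces to two row ranges and two column ranges, whose pairwise combinations are exactly the four requested sub-regions.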
The result of a merged query is the Cartesian product of the image sub-regions, stored essentially as a one-dimensional array. To illustrate, suppose there are two image sub-region query requests, S1 = {0,2,5,7} and S2 = {2,3,4,6}; the returned Cartesian product is then a one-dimensional array of four image sub-regions: {0,2,5,7}, {0,2,4,6}, {2,3,5,7} and {2,3,4,6}. Fig. 5 shows the flow for extracting a sub-region block. First, it is judged whether the sub-region is in the merged query sub-region set; if not, False is returned. If it is, the offset of the sub-region to be extracted within the merged query sub-region set is returned. Because every logical row of the Cartesian product result contains the same number of matrix units, the number of matrix units per row can be obtained from the column start and end differences in the merged query sub-region set. Next, the offset position of the sub-region in the one-dimensional array is calculated: it equals the number of matrix units per row multiplied by the sum of the row start and end differences of all rows whose offset in the merged query sub-region set is smaller than that of the sub-region to be extracted. Then the offset position within the row is calculated: it equals the sum of the column start and end differences of the sub-regions with smaller offsets in the merged query sub-region set. Finally, the offset position of the pixel data of each region in the one-dimensional array is obtained from the row offset position and the column start position of the extracted sub-region block. After the information of each sub-region has been obtained, the image access interface layer returns the corresponding image data for each space vector query request.
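The Cartesian product and the offset calculation can be sketched as follows. This is one possible interpretation, not the patent's exact algorithm: sub-regions are assumed to be {row start, row end, column start, column end} with inclusive bounds, the merged result is assumed to be flattened row by row, and all function names are hypothetical.

```python
from itertools import product


def span(rng) -> int:
    """Length of an inclusive (start, end) range."""
    start, end = rng
    return end - start + 1


def cartesian_subregions(rows, cols):
    """Every row range paired with every column range."""
    return [(r0, r1, c0, c1) for (r0, r1), (c0, c1) in product(rows, cols)]


def merged_query(matrix, rows, cols):
    """Flatten the selected rows x selected columns into a one-dimensional list."""
    flat = []
    for r0, r1 in rows:
        for r in range(r0, r1 + 1):
            for c0, c1 in cols:
                flat.extend(matrix[r][c0:c1 + 1])
    return flat


def extract(flat, rows, cols, sub):
    """Return the sub-region as a list of rows, or False if it was not merged."""
    row_rng, col_rng = (sub[0], sub[1]), (sub[2], sub[3])
    if row_rng not in rows or col_rng not in cols:
        return False
    width = sum(span(c) for c in cols)  # matrix units per logical row
    row_off = sum(span(r) for r in rows[:rows.index(row_rng)])
    col_off = sum(span(c) for c in cols[:cols.index(col_rng)])
    out = []
    for i in range(span(row_rng)):
        start = (row_off + i) * width + col_off
        out.append(flat[start:start + span(col_rng)])
    return out


# Ranges taken from S1 = {0,2,5,7} and S2 = {2,3,4,6} in the text.
rows, cols = [(0, 2), (2, 3)], [(5, 7), (4, 6)]
matrix = [[10 * r + c for c in range(8)] for r in range(8)]  # toy image
flat = merged_query(matrix, rows, cols)
```

With these inputs, `cartesian_subregions(rows, cols)` reproduces the four sub-regions listed in the text, and `extract` recovers any of them from `flat` using only the row and column offsets.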
To investigate whether the distributed caching method for remote sensing images of hot spot regions meets the requirement of millisecond-level response for hot spot image queries, the embodiment of the invention was verified on five machines, each with an Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz, 32GB of memory and a 500GB disk. The host storage manager, the distributed cache system and the object storage database were all installed on these five machines, which were networked through an H3C switch with a throughput rate of 1000MB/s. With Bingfeng county in the Enshi autonomous prefecture of Hubei as the space query vector, 100 Sentinel2 satellite images from 2018 to 2019 were queried, and the read delay and the throughput rate were observed as multiple clients repeatedly requested the same area. The read delay is the average time from a client's request to the response, and the throughput rate is measured over the total time the system spends returning all images to users when the storage manager receives multiple multi-threaded concurrent accesses. As shown in table 1, as the number of threads (the number of concurrent users) increases, the read delay drops rapidly while the throughput rate rises steadily. The first read of an image block returns after a delay of about 1s; subsequent read requests are served from the image blocks held in memory, so the read delay shows no upward trend. As for throughput, because cached image blocks are returned directly from local memory, the throughput rate increases accordingly. Table 1 shows that adding a distributed cache to the matrix-oriented object storage system gradually reduces the query time for high-frequency access to the same block.
TABLE 1
Number of threads    Read latency (ms)    Throughput rate (MB/s)
10                   1111                 1540
20                   517                  2114
30                   223                  2456
40                   168                  2890
50                   35                   3420
Each L2-level product of the Sentinel2 satellite images was 1.02 GB in size, and 320 sub-regions of each image were acquired for Fengsheng county, Enshi. With 100 users accessing concurrently, the time overhead was 1253 ms without the combined query and 58 ms with it, a reduction of about 95%. This result shows that the combined sub-region query algorithm can effectively reduce the service response time of space vector query requests.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. A self-adaptive distributed remote sensing image caching method, characterized by comprising the following steps:
S1, selecting a remote sensing image to be cached in a distributed cache from the remote sensing images stored in a distributed object storage database according to the sending time, the geographic spatial range, and the time range of a space vector query request;
S2, storing the selected remote sensing image in the distributed cache; the distributed cache is a cache system composed of the memory currently available on a plurality of hosts.
2. The self-adaptive distributed remote sensing image caching method according to claim 1, wherein step S1 specifically comprises:
for remote sensing image query requests issued at the same moment for the same geographic spatial range and the same time range, when the number of requests is greater than a first threshold N, obtaining the remote sensing images from the distributed object storage database and caching them;
for remote sensing image query requests issued at different moments for the same geographic spatial range and the same time range, when the number of requests within a specified time period is greater than the first threshold N, obtaining the remote sensing images from the distributed object storage database and caching them;
for remote sensing image query requests issued at different moments for the same geographic spatial range and different time ranges, caching the remote sensing images whose query frequency exceeds a second set threshold M; when two or more geographic space vectors have the same query frequency, determining which corresponding remote sensing images to cache according to the containment relation of the space vectors: when space vector A contains space vector B, caching the remote sensing image corresponding to space vector A; and when there is no containment relation between the space vectors, caching all the remote sensing images corresponding to space vectors A and B.
3. The adaptive distributed remote sensing image caching method according to claim 2, wherein the query frequency of the space vector is obtained by dividing the number of times of access by a set time interval.
4. The self-adaptive distributed remote sensing image caching method according to claim 2, wherein when the number of images managed by the distributed cache exceeds a preset upper limit or when the response delay of an image query request becomes long, a distributed cache clearing mechanism is triggered, and the remote sensing images which are least frequently used within a set time are deleted to store new images with high access frequency.
5. A self-adaptive distributed remote sensing image retrieval method, characterized by comprising the following steps:
S1, an image access interface layer selects the storage manager least used within a set time period to process a space vector query request;
S2, the storage manager searches the corresponding host memory for a remote sensing image corresponding to the space vector query request; each host corresponds to one storage manager, which is responsible for storing remote sensing images into the distributed cache system and for retrieval tasks;
01. if the remote sensing image exists, it is returned directly by the storage manager;
02. if the remote sensing image does not exist, a space vector query request is sent to the distributed cache, the distributed cache sends the query request to the distributed object storage database, the remote sensing image is retrieved from the distributed object storage database, the result is returned, and the current host is notified that the result has been returned;
03. if the distributed cache does not return a result within the specified time range, the storage manager running on the host sends a query request directly to the distributed object storage database, and the remote sensing image is returned and stored into the distributed cache.
6. The method of claim 5, wherein prior to sending the query request to the distributed object storage database, the method further comprises:
combining a plurality of space vector query requests covered by the same remote sensing image to form a combined query sub-region set.
7. The method of claim 6, further comprising:
determining whether the queried sub-region is in the combined query sub-region set, and if not, returning False; if it is, calculating the offset position of the sub-region in the one-dimensional array corresponding to the Cartesian product from the sum of the differences between the starting column and the ending column of each sub-region and the number of units in each sub-region, and obtaining the data of each sub-region.
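The caching and retrieval claims above can be illustrated with a short Python sketch. It is a simplified, single-process stand-in under several assumptions, not the claimed system: a query is identified by a hashable `key` (for example a tuple of geographic range and time range), plain dictionaries stand in for the host memory and the distributed object storage database, only the request-count threshold N of claim 2 and the least-frequently-used clearing of claim 4 are modelled, and the timeout branch (case 03 of claim 5) is omitted. The names `HotspotCache` and `retrieve` are hypothetical.

```python
import time
from collections import Counter, deque


class HotspotCache:
    """Toy in-memory stand-in for the distributed cache (hypothetical API)."""

    def __init__(self, threshold_n, window_s, capacity):
        self.threshold_n = threshold_n  # first threshold N (claim 2)
        self.window_s = window_s        # specified time period (claim 2)
        self.capacity = capacity        # preset upper limit (claim 4)
        self.requests = {}              # key -> deque of request timestamps
        self.hits = Counter()           # key -> number of cache reads
        self.store = {}                 # key -> cached image data

    def should_cache(self, key, now=None):
        """Record one request; True once more than N fall inside the window."""
        now = time.monotonic() if now is None else now
        stamps = self.requests.setdefault(key, deque())
        stamps.append(now)
        while stamps and now - stamps[0] > self.window_s:
            stamps.popleft()
        return len(stamps) > self.threshold_n

    def get(self, key):
        if key in self.store:
            self.hits[key] += 1
            return self.store[key]
        return None

    def put(self, key, image):
        if key not in self.store and len(self.store) >= self.capacity:
            # Clear the least frequently used entry to make room (claim 4).
            victim = min(self.store, key=lambda k: self.hits[k])
            del self.store[victim]
            del self.hits[victim]
        self.store[key] = image


def retrieve(key, local_memory, cache, object_store):
    """Tiered lookup loosely following claim 5: host memory, cache, database."""
    if key in local_memory:        # case 01: served from host memory
        return local_memory[key]
    image = cache.get(key)         # case 02: ask the cache next
    if image is None:
        image = object_store[key]  # fall through to the object store
        if cache.should_cache(key):
            cache.put(key, image)
    return image
```

In this sketch a key is admitted to the cache only after more than N requests for it arrive within the window, so rarely requested regions never displace hot ones. In the claims the distributed cache itself forwards misses to the object storage database; here `retrieve` does so directly to keep the example short.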
CN202011341421.2A 2020-11-25 2020-11-25 Self-adaptive distributed remote sensing image caching and searching method Active CN112395453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011341421.2A CN112395453B (en) 2020-11-25 2020-11-25 Self-adaptive distributed remote sensing image caching and searching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011341421.2A CN112395453B (en) 2020-11-25 2020-11-25 Self-adaptive distributed remote sensing image caching and searching method

Publications (2)

Publication Number Publication Date
CN112395453A true CN112395453A (en) 2021-02-23
CN112395453B CN112395453B (en) 2024-03-19

Family

ID=74603909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011341421.2A Active CN112395453B (en) 2020-11-25 2020-11-25 Self-adaptive distributed remote sensing image caching and searching method

Country Status (1)

Country Link
CN (1) CN112395453B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934759A (en) * 2022-11-30 2023-04-07 二十一世纪空间技术应用股份有限公司 Accelerated computing method for massive multi-source heterogeneous satellite data query

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160267132A1 (en) * 2013-12-17 2016-09-15 Hewlett-Packard Enterprise Development LP Abstraction layer between a database query engine and a distributed file system
CN106126604A (en) * 2016-06-20 2016-11-16 华南理工大学 A kind of social security data log analysis process system based on Distributed Data Warehouse
CN111125392A (en) * 2019-12-25 2020-05-08 华中科技大学 Remote sensing image storage and query method based on matrix object storage mechanism

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934759A (en) * 2022-11-30 2023-04-07 二十一世纪空间技术应用股份有限公司 Accelerated computing method for massive multi-source heterogeneous satellite data query
CN115934759B (en) * 2022-11-30 2023-12-22 二十一世纪空间技术应用股份有限公司 Acceleration calculation method for massive multi-source heterogeneous satellite data query

Also Published As

Publication number Publication date
CN112395453B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
US7797275B2 (en) System and method of time-based cache coherency maintenance in user file manager of object-based storage system
CN107346307B (en) Distributed cache system and method
EP2062123B1 (en) Automatic load spreading in a clustered network storage system
US6370620B1 (en) Web object caching and apparatus for performing the same
US8463846B2 (en) File bundling for cache servers of content delivery networks
US6732117B1 (en) Techniques for handling client-oriented requests within a data storage system
US10025718B1 (en) Modifying provisioned throughput capacity for data stores according to cache performance
JP2004511840A (en) Replacement management of data in one node's cache based on another node's cache
US10057368B1 (en) Method and system for incremental cache lookup and insertion
CN105635196A (en) Method and system of file data obtaining, and application server
WO2023185770A1 (en) Cloud data caching method and apparatus, device and storage medium
CN114817195A (en) Method, system, storage medium and equipment for managing distributed storage cache
CN114844846A (en) Multi-level cache distributed key value storage system based on programmable switch
CN112395453B (en) Self-adaptive distributed remote sensing image caching and searching method
US11055223B2 (en) Efficient cache warm up based on user requests
CN112559459B (en) Cloud computing-based self-adaptive storage layering system and method
CN107450860B (en) Map file pre-reading method based on distributed storage
CN116541553A (en) Video scheduling method, device, equipment and readable storage medium
CN113660336B (en) Cloud computing and fog computing system using KV storage device
JPH06290090A (en) Remote file accessing system
JPH05143435A (en) Data base system
CN105930519A (en) Globally shared read caching method based on cluster file system
CN117539915B (en) Data processing method and related device
JP2001256098A (en) Method for controlling cache in proxy server
CN112637327B (en) Data processing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant