CN116644202A - Method and device for storing large-data-volume remote sensing image data - Google Patents


Info

Publication number
CN116644202A
Authority
CN
China
Prior art keywords
data
nodes
remote sensing
dividing
sensing image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310667083.9A
Other languages
Chinese (zh)
Inventor
周宁
杨毅
钟普天
刘宁山
徐斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yizhirui Information Technology Co ltd
Original Assignee
Beijing Jietai Yunji Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jietai Yunji Information Technology Co ltd
Priority to CN202310667083.9A
Publication of CN116644202A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the field of big data, and in particular to a method and a device for storing large-data-volume remote sensing image data. The method comprises the following steps: dividing the data into N segments; dividing a storage space into M nodes; setting one of the M nodes as a configuration node and the remaining M-1 nodes as working nodes for storing data; and using the configuration node to store the data of the N segments on the corresponding M-1 working nodes. The method further comprises: adding K working nodes; and the configuration node allocating data storage across the M+K-1 working nodes present after the addition. The application improves efficiency in the storage process and ensures the stability and reliability of the system.

Description

Method and device for storing large-data-volume remote sensing image data
Technical Field
The application relates to the field of big data, and in particular to a method and a device for storing large-data-volume remote sensing image data.
Background
In the field of remote sensing spatial data storage, once the data volume in a database reaches the tens of millions or even hundreds of millions, data retrieval (attribute retrieval, spatial aggregation and the like) takes longer, so the system responds slowly and interface calls time out. Conventional database and table splitting (sharding) is a technique for horizontally partitioning data in a relational database. Its main goal here is to divide a large amount of spatial data into smaller data blocks and distribute them across different databases, in order to improve database performance and scalability. Typically, databases and tables are split once the data size exceeds an acceptable range. The technique rests on the idea of data dispersion: a large database is split into multiple smaller databases, and the spatial data is dispersed among them. However, splitting databases and tables increases the number of components in the system and therefore the management and configuration effort, which makes the overall system harder to develop, deploy and maintain. The target data may also be scattered across several databases, so spatial operations such as aggregation require multiple operations to produce a result, and storage efficiency is not noticeably improved.
Disclosure of Invention
The application provides a method and a device for storing large-data-volume remote sensing image data, which aim to solve the technical problem that efficiency is not noticeably improved in the storage process.
The application provides a method for storing large-data-volume remote sensing image data, which adopts the following technical solution:
In a first aspect, a method for storing large-data-volume remote sensing image data is provided, including:
dividing the large-data-volume remote sensing image data into N segments;
dividing a storage space into M nodes;
setting one of the M nodes as a configuration node, and setting the remaining M-1 nodes as working nodes for storing the large-data-volume remote sensing image data;
and using the configuration node to store the data of the N segments on the corresponding M-1 working nodes.
Preferably, the method further comprises:
adding K working nodes;
the configuration node allocating data storage across the M+K-1 working nodes present after the addition.
Preferably, the method further comprises:
copying the N segments into P copies, and distributing the copies to the M-1 working nodes.
Preferably, dividing the large-data-volume remote sensing image data into N segments includes:
creating a distributed index;
dividing a table into a plurality of fragments according to the structure of the distributed index;
dividing the data into N segments and adding the segments to the corresponding table fragments.
Preferably, the distributed index includes a distributed B-Tree and/or a hash index.
In a second aspect, a device for storing large-data-volume remote sensing image data is also provided, including:
a first dividing module, configured to divide the large-data-volume remote sensing image data into N segments;
a second dividing module, configured to divide the storage space into M nodes;
a setting module, configured to set one of the M nodes as a configuration node and the remaining M-1 nodes as working nodes for storing the large-data-volume remote sensing image data;
a first storage module, configured to use the configuration node to store the large-data-volume remote sensing image data of the N segments on the corresponding M-1 working nodes.
Preferably, the device further comprises:
an adding module, configured to add K working nodes;
a second storage module, configured to have the configuration node allocate data storage across the M+K-1 working nodes present after the addition.
Preferably, the device further comprises:
a replication module, configured to copy the N segments into P copies and distribute the copies to the M-1 working nodes.
Preferably, the first dividing module includes:
a creation module, configured to create a distributed index;
an allocation module, configured to divide a table into a plurality of fragments according to the structure of the distributed index;
a third storage module, configured to divide the large-data-volume remote sensing image data into N segments and add the segments to the corresponding table fragments.
Preferably, the distributed index includes a distributed B-Tree and/or a hash index.
In summary, the present application provides at least one of the following beneficial technical effects:
1. Efficient storage of large-scale data is supported. For spatial remote sensing metadata, the data can be partitioned horizontally and different data stored on different nodes, which avoids single points of failure and data skew and improves the reliability and scalability of the whole system.
2. The data can be distributed over different nodes by spatial hashing. For spatial remote sensing metadata, a spatial hash algorithm can hash the data by spatial coordinates so that adjacent data is assigned to the same node. This improves query efficiency, reduces network overhead and improves the performance of the whole system.
3. The number of nodes and the data sharding can be adjusted in real time as the data volume grows, which ensures the stability and reliability of the system.
Drawings
FIG. 1 is a diagram of a first embodiment of a method for storing large data volume remote sensing image data;
FIG. 2 is a diagram of a second embodiment of a method for storing large data volume remote sensing image data;
FIG. 3 is a diagram of a third embodiment of a method for storing large data volume remote sensing image data;
FIG. 4 is a diagram of an embodiment of dividing data into N segments;
FIG. 5 is a diagram of a first embodiment of a large data volume remote sensing image data storage device;
FIG. 6 is a diagram of a second embodiment of a large data volume remote sensing image data storage device;
FIG. 7 is a diagram of a third embodiment of a large data volume remote sensing image data storage device;
FIG. 8 is a diagram of an embodiment of the first dividing module.
Reference numerals: 1. a large-data-volume remote sensing image data storage device; 11. a first dividing module; 12. a second dividing module; 13. a setting module; 14. a first storage module; 15. an adding module; 16. a second storage module; 17. a replication module; 111. a creation module; 112. an allocation module; 113. a third storage module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to FIGS. 1 to 8 and the embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The application provides a data storage method, which adopts the following technical solution:
In a first aspect, referring to FIG. 1, a method for storing large-data-volume remote sensing image data is provided, including:
S1: dividing the large-data-volume remote sensing image data into N segments. Because the volume of remote sensing image data is large, it needs to be segmented. If the data is not segmented, a single piece of data may span nodes or servers when it is stored; storage is then convenient, but problems such as access timeouts can occur when the data is accessed.
S2: dividing a storage space into M nodes. The storage space includes various storage servers or storage media. Multiple storage devices may be grouped into a single node, and a single server may also be divided into multiple nodes. In this embodiment, the servers in the storage space are divided into M nodes; a single node may include one server or several. The storage space is divided into M nodes to match the division of the data into N segments, so that the N segments can be appropriately allocated to the M nodes of the storage space.
S3: setting one of the M nodes as a configuration node, and setting the remaining M-1 nodes as working nodes for storing data. Only one configuration node is needed, although that node may include one server or several. The configuration node mainly coordinates data distribution, storage, retrieval and access across the remaining M-1 nodes. The working nodes mainly store data. Deciding how the data is stored is the main task of the configuration node. For example, the configuration node segments the data according to its length, matches the data segments against the storage space available on the working nodes, and distributes different data segments to the servers of different working nodes according to a suitable matching rule.
S4: using the configuration node to store the large-data-volume remote sensing image data of the N segments on the corresponding M-1 working nodes. The purpose of this step is to store the large-data-volume remote sensing image data more effectively.
Preferably, referring to FIG. 2, the method further comprises:
S5: adding K working nodes. In actual production, a shortage of servers can leave data storage capacity insufficient at any time. In that case K working nodes need to be added, that is, K working nodes that are not yet in the management domain of the original configuration node. How to store data reasonably on the K added working nodes is the technical problem addressed by this embodiment.
S6: the configuration node allocates data storage across the M+K-1 working nodes present after the addition. Once K nodes have been added as working nodes, their locations must be configured in the configuration node, that is, the information about the K nodes must be recorded in the configuration node. The specific procedure is as follows:
Scanning the tables: all tables that need to be rebalanced are scanned, and the partitions of each table are recorded.
Determining the number of partitions: for each table, the number of its partitions is calculated and recorded.
Determining the allocation policy: the allocation policy is determined based on the number of partitions of each table. For example, a load-balancing-based allocation policy may be used if all partitions of a table are the same, and an importance-based allocation policy may be used if some partitions of a table are much larger than others.
Allocating tasks: the allocation policy is applied to each table, and each query task is assigned to the most suitable node. This process involves a number of algorithms and techniques, such as load balancing algorithms, distance algorithms and importance algorithms.
Adjusting task distribution: after the tasks are allocated, the task distribution is checked and adjusted as required. For example, if a node has too many tasks, some of them may be moved to other nodes to balance the distribution.
The above steps are repeated until the partitions of all tables are rebalanced.
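The rebalancing performed after K working nodes are added can be sketched as below. This is a simplified illustration under assumptions: the function name `rebalance` and the partition labels are hypothetical, and only a count-based load-balancing policy is shown (the importance-based and distance-based policies mentioned in the text are not modelled). Partitions are moved from the most loaded working node to the least loaded one until the counts differ by at most one.

```python
def rebalance(assignment, new_workers):
    """Return a new worker -> partitions mapping that includes new_workers.

    assignment:  dict mapping each existing working node to its partitions
    new_workers: names of the K working nodes being added (start out empty)
    """
    # Work on a copy so the recorded pre-rebalance state stays untouched.
    balanced = {worker: list(parts) for worker, parts in assignment.items()}
    for worker in new_workers:
        balanced.setdefault(worker, [])

    while True:
        busiest = max(balanced, key=lambda w: len(balanced[w]))
        idlest = min(balanced, key=lambda w: len(balanced[w]))
        # Stop once no node holds two or more partitions beyond another.
        if len(balanced[busiest]) - len(balanced[idlest]) <= 1:
            return balanced
        balanced[idlest].append(balanced[busiest].pop())


# Example: two working nodes hold four partitions each; K = 2 nodes are added.
before = {
    "worker-1": ["t1_p0", "t1_p1", "t1_p2", "t1_p3"],
    "worker-2": ["t2_p0", "t2_p1", "t2_p2", "t2_p3"],
}
after = rebalance(before, ["worker-3", "worker-4"])
```

A real system would also have to move the underlying data and update the configuration node's metadata for each relocated partition; the sketch covers only the placement decision.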
Preferably, referring to FIG. 3, the method further comprises:
S7: copying the N segments into P copies, and distributing the copies to the M-1 working nodes. Increasing P increases the number of replicas. A replica set can increase the availability and performance of the database, because when one node fails, other nodes can take over its work. A replica set also increases the redundancy and security of the data, since any node may provide a replica of the data. However, more replicas also require more storage capacity and computing resources. The replicas are distributed across the M-1 working nodes in order to guarantee the security and redundancy of the copies of the N segments.
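One way to picture the replica placement in S7 is the sketch below. It rests on assumptions not stated in the text: P is taken as the total number of copies of each segment, copies of the same segment are placed on distinct working nodes (so P cannot exceed M-1), and placement is a simple rotation over the worker list. The names `place_replicas` and `readable_after_failure` are hypothetical.

```python
def place_replicas(n_segments, workers, p):
    """Map each segment id to the p distinct working nodes holding a copy."""
    if p < 1 or p > len(workers):
        raise ValueError("P must be between 1 and the number of working nodes")
    placement = {}
    for seg in range(n_segments):
        # Rotate through the worker list so copies spread evenly.
        placement[seg] = [workers[(seg + r) % len(workers)] for r in range(p)]
    return placement


def readable_after_failure(placement, failed):
    """True if every segment still has a copy on a node outside `failed`."""
    return all(any(node not in failed for node in nodes)
               for nodes in placement.values())


# Example: N = 4 segments, M - 1 = 3 working nodes, P = 2 copies each.
workers = ["worker-1", "worker-2", "worker-3"]
placement = place_replicas(4, workers, p=2)
```

With P = 2, the loss of any single working node leaves every segment readable, at the cost of twice the storage, which is the trade-off described above.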
Preferably, referring to FIG. 4, dividing the data into N segments includes:
S11: creating a distributed index;
S12: dividing the table into a plurality of fragments according to the structure of the distributed index. An index creation operation is performed on the table to create the distributed index, and the table is divided into a plurality of fragments according to the structure of the index. The size of each fragment is fixed and can be set according to actual requirements.
S13: dividing the large-data-volume remote sensing image data into N segments, and adding the segments to the corresponding table fragments. The data pages are loaded into the corresponding fragments. The data page is the core data structure of a fragment, and each data page contains a number of data rows. The status of a fragment needs to be updated whenever a data page is inserted, updated or deleted.
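Steps S11 to S13 can be illustrated with a small range-partitioned index. This is only a sketch under assumptions: a sorted list of split keys stands in for the upper level of a distributed B-Tree, each fragment is modelled as an in-memory dictionary rather than a set of data pages, and the fragment "status" is reduced to a row count. The class name `RangeIndex` and the sample keys are hypothetical.

```python
import bisect


class RangeIndex:
    """Routes rows to table fragments by key range (hypothetical model)."""

    def __init__(self, split_keys):
        # k split keys define k + 1 fragments: keys below the first split
        # key go to fragment 0, keys at or above the last go to fragment k.
        self.split_keys = sorted(split_keys)
        self.fragments = [dict() for _ in range(len(self.split_keys) + 1)]

    def fragment_of(self, key):
        """S12: the index structure decides which fragment owns a key."""
        return bisect.bisect_right(self.split_keys, key)

    def insert(self, key, row):
        """S13: add a row to its fragment (insert or update)."""
        self.fragments[self.fragment_of(key)][key] = row

    def delete(self, key):
        self.fragments[self.fragment_of(key)].pop(key, None)

    def status(self):
        """Fragment status, reduced here to the row count per fragment."""
        return [len(fragment) for fragment in self.fragments]


# Example: three fragments covering keys < 100, 100..199 and >= 200.
index = RangeIndex([100, 200])
for scene_id in (5, 100, 150, 250):
    index.insert(scene_id, {"scene": scene_id})
```

Because `status()` is computed from the fragments themselves, it reflects every insert, update and delete, which corresponds to the requirement that the fragment status be updated on each change.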
Preferably, the distributed index includes a distributed B-Tree and/or a hash index. These indexes can be partitioned across multiple nodes, which enables this embodiment to handle tables with a large number of rows and complex geometric objects. The spatial table is partitioned horizontally so that different spatial data can be stored on different nodes, which enables this embodiment to accelerate spatial queries and support larger data sets. Queries can be processed automatically in parallel on multiple nodes, which improves the parallel execution capability of spatial data operations and makes query and analysis more efficient. The method also scales horizontally: it can easily be extended to more nodes, so the overall system can be expanded to handle growing data requirements. The data can be distributed over different nodes by spatial hashing. For spatial remote sensing metadata, a spatial hash algorithm can hash the data by spatial coordinates so that adjacent data is assigned to the same node. This improves query efficiency, reduces network overhead and improves the performance of the whole system.
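The spatial hashing described above can be sketched with a grid-based hash. The text does not name a specific algorithm, so everything here is an assumption: coordinates are longitude and latitude in degrees, space is cut into square cells of `cell_deg` degrees, and the cell identifier is hashed with CRC32 and reduced modulo the number of working nodes. The function names are hypothetical.

```python
import math
import zlib


def grid_cell(lon, lat, cell_deg=1.0):
    """Snap a coordinate to the integer index of its grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def node_for(lon, lat, n_workers, cell_deg=1.0):
    """Pick the working node (0 .. n_workers - 1) for a coordinate.

    All points in the same grid cell hash to the same node, so data that
    is close together tends to be stored together.
    """
    cx, cy = grid_cell(lon, lat, cell_deg)
    # CRC32 is used because it is stable across processes and machines.
    digest = zlib.crc32(f"{cx}:{cy}".encode("ascii"))
    return digest % n_workers


# Two scenes inside the same 1-degree cell land on the same working node.
node_a = node_for(116.3, 39.9, n_workers=5)
node_b = node_for(116.7, 39.2, n_workers=5)
```

Note the limitation of this simple scheme: only points inside the same cell are guaranteed to share a node, so neighbours on opposite sides of a cell boundary may still be placed on different nodes. A larger `cell_deg` keeps more neighbours together at the cost of coarser load distribution.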
In a second aspect, referring to FIG. 5, a large-data-volume remote sensing image data storage device 1 is also provided, including:
the first dividing module 11, configured to divide the large-data-volume remote sensing image data into N segments;
the second dividing module 12, configured to divide the storage space into M nodes;
the setting module 13, configured to set one of the M nodes as a configuration node and the remaining M-1 nodes as working nodes for storing the large-data-volume remote sensing image data;
the first storage module 14, configured to use the configuration node to store the data of the N segments on the corresponding M-1 working nodes.
Preferably, referring to FIG. 6, the device further comprises:
the adding module 15, configured to add K working nodes;
the second storage module 16, configured to have the configuration node allocate data storage across the M+K-1 working nodes present after the addition.
Preferably, referring to FIG. 7, the device further comprises:
the replication module 17, configured to copy the N segments into P copies and distribute the copies to the M-1 working nodes.
Preferably, referring to FIG. 8, the first dividing module 11 includes:
the creation module 111, configured to create a distributed index;
the allocation module 112, configured to divide a table into a plurality of fragments according to the structure of the distributed index;
the third storage module 113, configured to divide the large-data-volume remote sensing image data into N segments and add the segments to the corresponding table fragments.
Preferably, the distributed index includes a distributed B-Tree and/or a hash index.
In summary, the present application provides at least one of the following beneficial technical effects:
1. Efficient storage of large-scale data is supported. For spatial remote sensing metadata, the data can be partitioned horizontally and different data stored on different nodes, which avoids single points of failure and data skew and improves the reliability and scalability of the whole system.
2. The data can be distributed over different nodes by spatial hashing. For spatial remote sensing metadata, a spatial hash algorithm can hash the data by spatial coordinates so that adjacent data is assigned to the same node. This improves query efficiency, reduces network overhead and improves the performance of the whole system.
3. The number of nodes and the data sharding can be adjusted in real time as the data volume grows, which ensures the stability and reliability of the system.
The foregoing are preferred embodiments of the application and do not limit its scope in any way. Unless expressly stated otherwise, any feature disclosed in this specification (including the abstract and drawings) may be replaced by an alternative feature serving the same or an equivalent purpose. That is, unless expressly stated otherwise, each feature is only one example of a generic series of equivalent or similar features.

Claims (10)

1. A method for storing large-data-volume remote sensing image data, characterized by comprising the following steps:
dividing the data into N segments;
dividing a storage space into M nodes;
setting one of the M nodes as a configuration node, and setting the remaining M-1 nodes as working nodes for storing the large-data-volume remote sensing image data;
and using the configuration node to store the data of the N segments on the corresponding M-1 working nodes.
2. The method as recited in claim 1, further comprising:
adding K working nodes;
the configuration node allocating data storage across the M+K-1 working nodes present after the addition.
3. The method as recited in claim 1, further comprising:
copying the N segments into P copies, and distributing the copies to the M-1 working nodes.
4. The method of claim 1, wherein the dividing the data into N segments comprises:
creating a distributed index;
dividing a table into a plurality of fragments according to the structure of the distributed index;
dividing the large-data-volume remote sensing image data into N segments, and adding the segments to the corresponding table fragments.
5. The method of claim 4, wherein the distributed index comprises: a distributed B-Tree and/or a hash index.
6. A storage device for large-data-volume remote sensing image data, comprising:
a first dividing module, configured to divide the large-data-volume remote sensing image data into N segments;
a second dividing module, configured to divide the storage space into M nodes;
a setting module, configured to set one of the M nodes as a configuration node and the remaining M-1 nodes as working nodes for storing the large-data-volume remote sensing image data;
a first storage module, configured to use the configuration node to store the data of the N segments on the corresponding M-1 working nodes.
7. The apparatus as recited in claim 6, further comprising:
an adding module, configured to add K working nodes;
a second storage module, configured to have the configuration node allocate data storage across the M+K-1 working nodes present after the addition.
8. The apparatus as recited in claim 6, further comprising:
a replication module, configured to copy the N segments into P copies and distribute the copies to the M-1 working nodes.
9. The apparatus of claim 6, wherein the first dividing module comprises:
a creation module, configured to create a distributed index;
an allocation module, configured to divide a table into a plurality of fragments according to the structure of the distributed index;
a third storage module, configured to divide the large-data-volume remote sensing image data into N segments and add the segments to the corresponding table fragments.
10. The apparatus of claim 9, wherein the distributed index comprises: a distributed B-Tree and/or a hash index.
CN202310667083.9A 2023-06-06 2023-06-06 Method and device for storing large-data-volume remote sensing image data Pending CN116644202A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310667083.9A CN116644202A (en) 2023-06-06 2023-06-06 Method and device for storing large-data-volume remote sensing image data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310667083.9A CN116644202A (en) 2023-06-06 2023-06-06 Method and device for storing large-data-volume remote sensing image data

Publications (1)

Publication Number | Publication Date
CN116644202A (en) | 2023-08-25

Family

ID=87624496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310667083.9A Pending CN116644202A (en) 2023-06-06 2023-06-06 Method and device for storing large-data-volume remote sensing image data

Country Status (1)

Country Link
CN (1) CN116644202A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20200210421A1 (en) * | 2018-12-29 | 2020-07-02 | Wuhan University | Method of storing remote sensing big data in hbase database
CN113778341A (en) * | 2021-09-17 | 2021-12-10 | 北京航天泰坦科技股份有限公司 | Distributed storage method and device for remote sensing data and remote sensing data reading method
CN114338718A (en) * | 2021-12-21 | 2022-04-12 | 浙江大学 | Distributed storage method, device and medium for massive remote sensing data


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240223

Address after: Room 105, 1st Floor, Building 5, No. 8 Dongbei Wangxi Road, Haidian District, Beijing, 100193

Applicant after: Yizhirui Information Technology Co.,Ltd.

Country or region after: China

Address before: 601, Unit 6, 3rd Floor, No. 25 Shangdi East Road, Haidian District, Beijing, 100089

Applicant before: Beijing Jietai Yunji Information Technology Co.,Ltd.

Country or region before: China