CN113204520B - Remote sensing data rapid concurrent read-write method based on distributed file system - Google Patents

Remote sensing data rapid concurrent read-write method based on distributed file system

Publication number
CN113204520B
Authority
CN
China
Prior art keywords
data
file
file system
hdfs
writing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110469599.3A
Other languages
Chinese (zh)
Other versions
CN113204520A (en)
Inventor
段延松
张祖勋
陶鹏杰
柯涛
张永军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202110469599.3A
Publication of CN113204520A
Application granted
Publication of CN113204520B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/122 File system administration, e.g. details of archiving or snapshots using management policies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method for rapid concurrent reading and writing of remote sensing data based on a distributed file system, in which the underlying physical structure inherits the characteristics of the HDFS file system. The method comprises: installing a Hadoop system on each data server in a computer cluster, establishing the HDFS file system, and then setting aside part of the space on each data server as the physical storage space of a self-owned file system; and adding a first-level wrapper over the HDFS service processing layer that takes over the operating system's access to the file system. When the operating system only needs to read a file and the file data already exists, the HDFS file system interface is called directly and HDFS completes the read. When the operating system's access to a file includes a file write operation, the file operation is taken over entirely: the self-owned file system performs the data reading and writing, and the data is synchronized into HDFS once the reading and writing are finished. The self-owned file system reads and writes on only one server. The invention enables rapid concurrent reading and writing of massive remote sensing data.

Description

Remote sensing data rapid concurrent read-write method based on distributed file system
Technical Field
The invention relates to the fields of computer application technology and remote sensing big data processing, specifically to concurrent read-write technology for big data files, and in particular to a technique for rapid concurrent reading and writing of remote sensing data based on a distributed file system.
Background
With the rapid development of remote sensing technology, more and more remote sensing satellites are being launched by various countries, and the satellite data received every day has reached the PB level. Processing such massive remote sensing data places higher demands on storage technology and processing speed. At the same time, with the rapid development of computer technology, and Internet technology in particular, data storage technology has made a qualitative leap; with the recent emergence of cloud technology it has reached an unprecedented level. Cloud technology is usually built on a distributed file system, which guarantees high reliability while improving capacity and cost-effectiveness. Among distributed file systems, the open-source HDFS (Hadoop Distributed File System) is particularly popular. HDFS is one of the core components of the Hadoop project and is the basis of its distributed data storage. HDFS is designed for streaming access to very large files and can run on cost-effective commodity servers. It offers high fault tolerance, high reliability, high scalability, high availability, and high throughput, which makes it very convenient for processing very large data sets.
However, HDFS also has the following 3 disadvantages, that is, cases in which HDFS is not applicable:
1. Low-latency data access, such as reading and writing data within milliseconds, cannot be achieved. HDFS is suited only to high-throughput scenarios, in which a large amount of data is written at one time; it does not support reading data back immediately after writing.
2. Storing large numbers of small files is not supported. Storing a large number of small files consumes a large amount of the index service (NameNode) memory to hold the index information of the data blocks, but the memory of the HDFS index service is limited and cannot be expanded at scale. In addition, with a large number of index entries the seek time exceeds the read time, which greatly reduces access efficiency.
3. Files cannot be written concurrently or modified randomly. An HDFS file must be accessed exclusively for writing: multiple threads are not allowed to write simultaneously, only append mode is supported, and random modification is not supported. In remote sensing big data processing, however, rewriting and updating data is a processing mode that must be supported. A remote sensing image is usually very large and cannot be reprocessed in its entirety each time; only the positions that need to change are modified. This characteristic is dictated by the algorithms used in professional remote sensing image processing.
Therefore, HDFS, with these 3 disadvantages, cannot be applied directly to remote sensing big data processing. The invention accordingly provides a method that improves the distributed file system to achieve rapid concurrent reading and writing of remote sensing big data.
Disclosure of Invention
To overcome the 3 disadvantages of HDFS described above, the invention provides an improved distributed file system obtained by modifying the native HDFS distributed file system.
To achieve this purpose, the technical scheme of the invention is as follows:
The invention provides a method for rapid concurrent reading and writing of remote sensing data based on a distributed file system, in which the underlying physical structure inherits the characteristics of the HDFS file system. The method comprises: installing a Hadoop system on each data server in a computer cluster, establishing the HDFS file system, and then setting aside part of the space on each data server as the physical storage space of a self-owned file system; and adding a first-level wrapper over the HDFS service processing layer that takes over the operating system's access to the file system. When the operating system only needs to read a file and the file data already exists, the HDFS file system interface is called directly and HDFS completes the reading of the file data.
When the operating system's access to a file includes a file write operation, the file operation is taken over entirely: the self-owned file system performs the data reading and writing, and after the reading and writing are finished the file write interface of the HDFS file system is called to synchronize the data into HDFS. The self-owned file system reads and writes on only one server.
Further, a plurality of general-purpose data servers are deployed in the computer cluster to provide data storage and scientific computing. After hardware installation is completed, a Hadoop system is installed on all data servers and the HDFS file system is established, completing the construction of the native HDFS storage cluster. Part of the space on each data server is also set aside as the physical storage space of the self-owned file system.
Further, the self-owned file system reads and writes data in RAID mode.
Further, one index server is deployed to manage the entire distributed file system.
Further, when the self-owned file system is scheduled, the data server with the best read-write performance is always selected to provide the data read-write service. The criterion for best performance is implemented as follows:
each data server provides an indicator that reports its data read-write condition in real time, and the value of the indicator is calculated as
$$I = \left(I_{\max} - \frac{D/T}{1-(C-S)}\right)\times(1-W)$$
where $I$ is the indicator value, $I_{\max}$ is the maximum data throughput provided by the server's hard disk group, $D$ is the most recent amount of data read and written, $T$ is the time taken to read and write $D$, $C$ is the current time, $S$ is the time at which statistics started, and $W$ is the CPU utilization of the storage server.
Further, after the index server selects the best-performing data server, subsequent file reading and writing is carried out by that data server. No data passes through the index server during reading and writing, which ensures that the index server does not become a read-write bottleneck.
Further, the process by which the data server reads and writes data covers two cases.
The first is writing brand-new data. The self-owned file system only needs to provide plain file reading and writing; after the reading and writing are finished, the self-owned file system calls the file write function of the HDFS system to synchronize the file into the HDFS file system, where it is served to the outside for reading.
The second is file rewriting, which modifies an existing file. First, a storage space of the same size as the file to be rewritten is immediately requested in the self-owned file system and partitioned into blocks consistent with the corresponding blocks of the HDFS file system, and all blocks are marked as 0. Then the file in HDFS is marked as invalid, the data blocks are read from HDFS, and the data is updated and written into the self-owned file system. Finally, the file in the self-owned file system is synchronized into the HDFS file system.
Further, for large numbers of small files, the file names in each folder are managed with the HBase database, and the file contents are managed in a large file made up of fixed-size records.
The invention is based on the open-source HDFS system and addresses its 3 disadvantages by designing a channel read-write technique (Gate IO). The distributed file system provides services to the outside, while the self-owned file system makes up for the lack of rapid concurrent file reading and writing. To balance the load of concurrent reads and writes, a scheduling strategy for concurrent reading and writing is designed into the self-owned file system, ensuring that the file system delivers good read-write performance. Finally, to support massive numbers of small files, the invention makes full use of the management capability of the HBase database: the file names in each folder are managed by HBase, and the file contents are managed in a large file with a fixed record size, so that rapid concurrent reading and writing of massive data is ultimately achieved.
The scheme of the invention is simple to implement and highly practical. It addresses the low practicability and inconvenient application of the related technology, can improve the user experience, and has significant market value.
Drawings
FIG. 1 is a block diagram of a logical structure according to an embodiment of the present invention;
FIG. 2 is a process flow diagram of an embodiment of the invention;
FIG. 3 is a diagram illustrating a physical storage structure of each server according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an entire distributed file system according to an embodiment of the present invention;
FIG. 5 is a schematic view of the read-write scheduling process of the self-owned file system according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is specifically described below with reference to the accompanying drawings and examples.
The invention provides a method that improves a distributed file system to achieve rapid concurrent reading and writing of remote sensing data. The method is named the channel read-write technique (Gate IO) and enables rapid concurrent reading and writing of massive data. To overcome the 3 disadvantages of HDFS, the embodiment modifies the native HDFS distributed file system as follows:
First, the underlying physical structure inherits the high fault tolerance, high reliability, high scalability, high availability, and high throughput of the HDFS file system, which ensures the efficiency and stability of the distributed file system.
Then, a first-level wrapper is added over the HDFS service processing layer to take over the operating system's access to the file system. If the operating system only needs to read a file and the file data already exists, the HDFS file system interface is called directly and HDFS completes the reading of the file data. If the operating system's access to the file includes a file write operation (creating or rewriting), the file operation is taken over entirely: the self-owned file system performs the data reading and writing, and after the reading and writing are finished the file write interface of the HDFS file system is called to synchronize the data into HDFS.
The logical structure and the processing flow of the embodiment are shown in FIG. 1 and FIG. 2. In the embodiment, Gate IO provides the file read-write service to the outside, the native HDFS file system provides the distributed data organization, and file writing is provided by the self-owned file system. The flow can be designed as follows:
determine from the file read-write request whether the file needs to be written;
if so, the self-owned file system reads and writes the data, and when the reading of existing data or the writing of data is finished, the data is synchronized with HDFS;
if not, the native HDFS reads the data.
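The decision flow above can be sketched in a few lines of Python. This is only an illustrative sketch: `Backend` and `route_request` are hypothetical names, not part of the patent or of any Hadoop API, and the handling of a read request for data that HDFS cannot serve is an assumption drawn from the later statement that access to a file marked invalid is handed over to the self-owned file system.

```python
from enum import Enum


class Backend(Enum):
    HDFS = "hdfs"              # native HDFS serves the request
    SELF_OWNED = "self_owned"  # self-owned file system serves it, then syncs to HDFS


def route_request(wants_write: bool, readable_in_hdfs: bool) -> Backend:
    """Decide which layer serves a file request (hypothetical helper)."""
    # Any access that includes writing (create or rewrite) is taken over entirely.
    if wants_write:
        return Backend.SELF_OWNED
    # Read-only access to data that already exists goes straight to HDFS.
    if readable_in_hdfs:
        return Backend.HDFS
    # Assumption: a read of a file that HDFS cannot serve (for example, one
    # marked invalid during a rewrite) is handed to the self-owned file system.
    return Backend.SELF_OWNED
```

For example, `route_request(wants_write=False, readable_in_hdfs=True)` selects native HDFS, while any request with `wants_write=True` selects the self-owned file system.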
To achieve rapid concurrent reading and writing of data, the invention requires a computer cluster, that is, a plurality of general-purpose data servers deployed to provide data storage and scientific computing. Each data server should have more than 4 hard disks installed; 8 are recommended. After hardware installation is completed, a Hadoop system is installed on all data servers and the HDFS file system is established, completing the construction of the native HDFS storage cluster. Part of the space on each data server is then set aside as the physical storage space of the self-owned file system. To ensure the read-write performance of the self-owned file system, its physical space must take full advantage of a disk array (RAID) formed from multiple hard disks; that is, the self-owned file system must read and write data in RAID mode. The RAID level is set according to the number of hard disks in each data server; RAID 5, which has a redundancy of 1, is recommended. The storage layout of each data server is shown in FIG. 3: each data server contains one part of physical storage space used by the native HDFS file system and another part used as the physical storage space of the self-owned file system.
Once the storage space of the individual data servers has been organized, the invention, like HDFS, requires an index server (that is, a NameNode server) to be deployed to manage the entire distributed file system. The composition of the entire distributed file system is shown in FIG. 4. In a specific implementation, a standby index server (NameNode) may also be set up. Unlike native HDFS, the index server deployed by the invention provides two types of service:
The first is the file read service, which directly uses the file reading of native HDFS and is not described further here.
The second is the file write service (including creating files and rewriting files), which is performed by the self-owned file system.
Unlike HDFS, in which files are distributed across the data servers according to predefined rules, the self-owned file system reads and writes a file on only one data server. To support concurrent reading and writing by multiple computers on multiple files, the self-owned file system needs a scheduling algorithm that balances the load of concurrent reads and writes across the servers. The general idea of the scheduling algorithm is to select, each time, the server with the best read-write performance to provide the data read-write service. The criterion for best performance is an indicator that each storage server provides and reports in real time to describe its data read-write condition. The maximum value of the indicator is 100. If there has been no recent data read-write request and free physical disk space is available, the indicator takes the maximum value of 100; if there is no free space, the indicator is set directly to 0; if data is being read and written, the indicator is calculated from the data flow over a period of time. The recommended statistical formula is:
$$I = \left(I_{\max} - \frac{D/T}{1-(C-S)}\right)\times(1-W)$$
where $I$ is the indicator value, $I_{\max}$ is the maximum data throughput provided by the hard disk group of the data server, $D$ is the most recent amount of data read and written, $T$ is the time taken to read and write $D$, $C$ is the current time, $S$ is the time at which statistics started, and $W$ is the CPU utilization of the storage server.
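As a worked illustration, the indicator can be evaluated with a small Python function. This is a sketch that transcribes the formula literally; the function name, the `has_free_space` and `recent_io` flags, and the guard against a zero denominator are assumptions added for the example, and the text does not specify the units of the time and throughput quantities.

```python
def indicator(i_max: float, d: float, t: float, c: float, s: float, w: float,
              has_free_space: bool = True, recent_io: bool = True) -> float:
    """Evaluate I = (I_max - (D/T) / (1 - (C - S))) * (1 - W) as written in the text."""
    # No free disk space: the indicator is set directly to 0.
    if not has_free_space:
        return 0.0
    # No recent read-write request and free space available: maximum value 100.
    if not recent_io:
        return 100.0
    denominator = 1.0 - (c - s)
    # Assumption: the formula is undefined when T is not positive or when
    # C - S equals 1, so reject those inputs instead of dividing by zero.
    if t <= 0 or denominator == 0:
        raise ValueError("indicator is undefined for these inputs")
    return (i_max - (d / t) / denominator) * (1.0 - w)


# Example with illustrative numbers: I_max = 100, D/T = 20, C - S = 0.5, W = 0.25
# gives (100 - 20 / 0.5) * 0.75 = 45.
value = indicator(i_max=100.0, d=20.0, t=1.0, c=0.5, s=0.0, w=0.25)
```

A higher CPU utilization `w` or a higher recent data rate `d / t` lowers the value in this example, so a busier server becomes less likely to be chosen.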
When selecting the data server for reading and writing, the case in which the server requesting the read or write itself provides data storage service must also be considered. In this case, if the local indicator value is not 0, the local storage service is selected preferentially.
After the index server selects a specific data storage server, subsequent file reading and writing is carried out by that data server. No data passes through the index server during reading and writing, which ensures that the index server does not become a read-write bottleneck.
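The selection rule, including the preference for local storage, might be sketched as follows. The function name `pick_server` and the dictionary of reported indicator values are hypothetical; treating any non-positive indicator as unavailable is an assumption, since the text only singles out the value 0.

```python
from typing import Dict, Optional


def pick_server(indicators: Dict[str, float],
                requester: Optional[str] = None) -> Optional[str]:
    """Return the data server that should handle a write (hypothetical helper).

    indicators maps a server name to its most recently reported indicator value.
    requester is the name of the requesting machine if it is itself a data server.
    """
    # Prefer local storage when the requester is a data server with a usable indicator.
    if requester is not None and indicators.get(requester, 0.0) > 0:
        return requester
    # Otherwise choose the server with the highest indicator among usable ones.
    usable = {name: value for name, value in indicators.items() if value > 0}
    if not usable:
        return None  # no server can accept the write right now
    return max(usable, key=lambda name: usable[name])
```

The index server would only return the chosen server name; the data itself then flows directly between the client and that data server.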
The process by which the selected data storage server reads and writes data covers two cases.
The first is writing brand-new data. This case is simple: the self-owned file system only needs to provide plain file reading and writing. After the reading and writing are finished, the self-owned file system must call the file write function of the HDFS system to synchronize the file into HDFS, where it is served to the outside for reading.
The second is file rewriting, that is, modifying an existing file. This case is more complex; the processing flow is shown in FIG. 5, and the main steps are as follows:
(1) First, immediately request a storage space in the self-owned file system of the same size as the file to be rewritten, and partition the space into blocks whose size is consistent with the blocks of the file to be rewritten in HDFS (Hadoop Distributed File System);
(2) Mark all blocks as 0, indicating that the data block does not exist;
(3) Mark the file to be modified as invalid in HDFS so that HDFS no longer accepts access to it; if a request to access the file arrives, it is handed over directly to the self-owned file system;
(4) Determine the data blocks to be modified from the rewrite address in the file, read their data into memory through the HDFS file read interface, update the modified part, write the data into the corresponding block of the self-owned file system, and change the block's mark to 1;
(5) After all modifications are finished, call the file write function of the HDFS system, as for a new file, to synchronize the file into the HDFS file system;
(6) Delete the original file marked as invalid in HDFS and release its storage space.
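Steps (1) to (6) can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the patented implementation: `FakeHdfs` is a hypothetical stand-in that keeps each file as a list of blocks, no real HDFS client calls are made, and carrying unmodified blocks (mark 0) over from the old HDFS copy during synchronization is an assumption, because the text does not say how those blocks are filled.

```python
from typing import Dict, List, Optional, Set


class FakeHdfs:
    """In-memory stand-in for HDFS: file name -> list of data blocks."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size
        self.files: Dict[str, List[bytes]] = {}
        self.invalid: Set[str] = set()


def rewrite(hdfs: FakeHdfs, name: str, offset: int, new_bytes: bytes) -> List[int]:
    """Overwrite new_bytes at offset in an existing file; return the block marks."""
    old_blocks = hdfs.files[name]
    size = sum(len(block) for block in old_blocks)
    if not new_bytes or offset < 0 or offset + len(new_bytes) > size:
        raise ValueError("rewrite range must lie inside the existing file")
    bs = hdfs.block_size

    # (1) + (2): reserve space with the same block layout and mark every block 0.
    staged: List[Optional[bytes]] = [None] * len(old_blocks)
    marks = [0] * len(old_blocks)

    # (3): mark the file invalid so HDFS stops serving it.
    hdfs.invalid.add(name)

    # (4): for each affected block, read it, patch it, stage it, set its mark to 1.
    end = offset + len(new_bytes)
    for i in range(offset // bs, (end - 1) // bs + 1):
        block = bytearray(old_blocks[i])
        start = i * bs
        lo, hi = max(offset, start), min(end, start + len(block))
        block[lo - start:hi - start] = new_bytes[lo - offset:hi - offset]
        staged[i] = bytes(block)
        marks[i] = 1

    # (5): synchronize into HDFS. Assumption: blocks still marked 0 are
    # carried over unchanged from the old HDFS copy.
    merged = [staged[i] if marks[i] == 1 else old_blocks[i]
              for i in range(len(old_blocks))]

    # (6): delete the invalid original and publish the new version.
    del hdfs.files[name]
    hdfs.invalid.discard(name)
    hdfs.files[name] = merged
    return marks
```

With a block size of 4 and a file stored as `[b"AAAA", b"BBBB", b"CC"]`, writing `b"xyz"` at offset 3 touches only the first two blocks; the third block keeps mark 0 and is never read into memory.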
In addition, to handle large numbers of small files, the embodiment uses the HBase database provided by the Hadoop system together with a dedicated small-file space. Because the native HDFS system does not support managing massive numbers of file names, the HBase database is introduced to manage file names. HBase provides string-based indexing and optimization and in theory supports managing an unlimited number of data entries; tests show that HBase can search for a string among billions of records in under 2 seconds. With the support of such a powerful database, retrieving massive numbers of file names is no longer a problem. The data content of small files is handled by building large files, that is, by storing many small files in one large file. When the system is installed, the user may set the size limit for small files (the recommended default is 64 KB), and while the system is running every file smaller than this limit is treated as a small file. In the file system provided by the invention, files are created through a single channel only: they are created in the self-owned file system and then synchronized into HDFS. When a small file is encountered during synchronization, the system does not synchronize it individually but writes its content into a large file shared by the small files. Note in particular that the large file holding the small files stores data in fixed-size records: the size of each record equals the small-file size threshold (for example, 64 KB), and data smaller than the threshold still occupies a full record.
When a small file's data is stored in a large file, the corresponding entry in the file name database must record the name of the large file that actually holds the data and the offset of the data within it, so that the actual content can be located when the file is read later. This organization handles writing and reading massive numbers of small files effectively, but it is unfavorable for deleting small files, so the invention has to do extra work to handle deletion; the overall result is still superior to the prior art. In general, when a small file is to be deleted, the index server operates directly on the HBase database to remove the file name record; the record is not actually deleted but is moved to another table, called the deleted-file table. When the deleted-file table grows to a certain size (for example, 100,000 records), consolidation of the small-file content is started. Consolidation does not need to be designed and developed separately: in essence it is a file rewrite, so the file rewrite function described above is executed directly on the large file that stores the small-file content.
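The fixed-record layout and the name index can be sketched as follows. This is an illustrative model only: plain dictionaries stand in for the HBase tables, a `bytearray` stands in for the large file, the class and field names are hypothetical, and storing the actual data length next to the offset is an assumption needed to strip the padding on read.

```python
from typing import Dict, Tuple

RECORD_SIZE = 64 * 1024  # small-file threshold and record size (64 KB default)


class SmallFileStore:
    """Packs small files into one large file made of fixed-size records."""

    def __init__(self, record_size: int = RECORD_SIZE,
                 compact_threshold: int = 100_000,
                 big_file_name: str = "smallfiles.dat") -> None:
        self.record_size = record_size
        self.compact_threshold = compact_threshold
        self.big_file_name = big_file_name
        self.big_file = bytearray()  # stand-in for the large file
        # Stand-ins for the HBase tables: name -> (large file name, offset, length).
        self.names: Dict[str, Tuple[str, int, int]] = {}
        self.deleted: Dict[str, Tuple[str, int, int]] = {}

    def write(self, name: str, data: bytes) -> None:
        if len(data) >= self.record_size:
            raise ValueError("not a small file; it should be synchronized to HDFS normally")
        offset = len(self.big_file)
        # Every record occupies a full slot, even when the data is shorter.
        self.big_file += data.ljust(self.record_size, b"\0")
        self.names[name] = (self.big_file_name, offset, len(data))

    def read(self, name: str) -> bytes:
        _, offset, length = self.names[name]
        return bytes(self.big_file[offset:offset + length])

    def delete(self, name: str) -> bool:
        """Move the name record to the deleted table; return True when consolidation is due."""
        self.deleted[name] = self.names.pop(name)
        return len(self.deleted) >= self.compact_threshold
```

Because every record has the same size, the offset of the n-th record is simply n times the record size, which is what makes later consolidation a straightforward block-wise rewrite of the large file.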
In summary, the present invention is based on the open-source HDFS system and designs a channel read-write technique to overcome the 3 disadvantages of HDFS, so that rapid concurrent reading and writing of massive data can ultimately be achieved.
It will be apparent to those skilled in the art that the steps of the invention described above may be implemented on general-purpose computers, with executable program code designed, developed, or directly deployed on a single computer or distributed over a network of multiple computing servers. In some cases the steps may be executed in an order different from that listed here, or may be built separately into one or more independent modules. The invention is therefore not limited to any particular combination of hardware and software.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (7)

1. A method for rapid concurrent reading and writing of remote sensing data based on a distributed file system, characterized in that: the underlying physical structure inherits the characteristics of the HDFS file system; a Hadoop system is installed on each data server in a computer cluster, the HDFS file system is established, and part of the space on each data server is then set aside as the physical storage space of a self-owned file system; a first-level wrapper is added over the HDFS service processing layer to take over the operating system's access to the file system, and when the operating system only needs to read a file and the file data already exists, the HDFS file system interface is called directly and HDFS completes the reading of the file data;
when the operating system's access to a file includes a file write operation, the file operation is taken over entirely, the self-owned file system performs the data reading and writing, and after the data reading and writing are finished the file write interface of the HDFS file system is called to synchronize the data into HDFS; the self-owned file system reads and writes on only one server; the self-owned file system performs the data reading and writing in RAID mode; a storage space of the same size as the file to be rewritten is requested in the self-owned file system, partitioned into blocks consistent with the corresponding blocks of the HDFS file system, and the blocks are marked as 0; the file in HDFS is then marked as invalid, the data blocks are read from HDFS, and the data is updated and written into the self-owned file system; and finally the file in the self-owned file system is synchronized into the HDFS file system.
2. The method for rapid concurrent reading and writing of remote sensing data based on a distributed file system according to claim 1, characterized in that: a plurality of general-purpose data servers are deployed in the computer cluster to provide data storage and scientific computing; after hardware installation is completed, a Hadoop system is installed on all data servers and the HDFS file system is established, completing the construction of the native HDFS storage cluster; and part of the space on each data server is set aside as the physical storage space of the self-owned file system.
3. The method for rapid concurrent reading and writing of remote sensing data based on a distributed file system according to claim 2, characterized in that: an index server is deployed to manage the entire distributed file system.
4. The method for rapid concurrent reading and writing of remote sensing data based on a distributed file system according to claim 3, characterized in that: when the self-owned file system is scheduled, the data server with the best read-write performance is always selected to provide the data read-write service, and the criterion for best performance is implemented as follows:
each data server provides an indicator that reports its data read-write condition in real time, and the value of the indicator is calculated as
$$I = \left(I_{\max} - \frac{D/T}{1-(C-S)}\right)\times(1-W)$$
where $I$ is the indicator value, $I_{\max}$ is the maximum data throughput provided by the server's hard disk group, $D$ is the most recent amount of data read and written, $T$ is the time taken to read and write $D$, $C$ is the current time, $S$ is the time at which statistics started, and $W$ is the CPU utilization of the storage server.
5. The method for rapid concurrent reading and writing of remote sensing data based on a distributed file system according to claim 4, characterized in that: after the index server selects the best-performing data server, subsequent file reading and writing is carried out by that data server, and no data passes through the index server during reading and writing, which ensures that no read-write bottleneck is formed.
6. The remote sensing data rapid concurrent read-write method based on the distributed file system according to claim 1, 2, 3, 4 or 5, characterized in that: the data server provides data reading and writing in two cases,
the first is writing brand-new data, in which the self-owned file system only needs to provide simple file reading and writing; after the file has been written, the self-owned file system calls the file-writing function of the HDFS system to synchronize the file into the HDFS file system, which then provides file reading service externally;
the second is rewriting a file, which is used for modifying an existing file: first, a storage space of the same size as the file to be rewritten is immediately allocated in the self-owned file system, partitioned into blocks consistent with the block partitioning of the HDFS file system, and marked as 0; then, the file in HDFS is marked as invalid, the HDFS data blocks are read, and the data are updated and written into the self-owned file system; finally, the file in the self-owned file system is synchronized into the HDFS file system.
7. The remote sensing data rapid concurrent read-write method based on the distributed file system according to claim 1, 2, 3, 4 or 5, characterized in that: for the case of a large number of small files, the file names in each folder are managed using an HBase database, and the file contents are managed using a large file with a fixed record size.
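As a non-limiting illustration of the scheduling described in claims 4 and 5, the following Python sketch evaluates the indicator exactly as the expression is written and returns only the address of the selected data server, so that no file data passes through the index server. The class `ServerStats`, the functions `indicator` and `pick_server`, the sample values, and the rule that the largest I counts as "best" are assumptions made for this sketch; the claims do not state units or a comparison direction.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServerStats:
    """Counters reported by one data server (all field names are hypothetical)."""
    address: str
    i_max: float  # I_max: maximum throughput of the server's hard disk group
    d: float      # D: amount of data most recently read and written
    t: float      # T: time taken to read and write D
    c: float      # C: current time
    s: float      # S: time at which statistics started
    w: float      # W: CPU utilization as a fraction in [0, 1]


def indicator(st: ServerStats) -> float:
    """Evaluate I = (I_max - (D/T) / (1 - (C - S))) * (1 - W) as written."""
    denom = 1.0 - (st.c - st.s)
    if st.t == 0 or denom == 0:
        # The expression is undefined here; the claim does not say what to do.
        raise ValueError("indicator undefined when T = 0 or C - S = 1")
    return (st.i_max - (st.d / st.t) / denom) * (1.0 - st.w)


def pick_server(servers: list[ServerStats]) -> str:
    """Return only the address of the server with the largest indicator.

    Assumption: a larger I means better read-write performance.
    The caller then reads or writes directly against that address.
    """
    return max(servers, key=indicator).address


# Sample values are assumptions chosen only to exercise the arithmetic.
node_a = ServerStats("node-a", i_max=200.0, d=100.0, t=2.0, c=0.5, s=0.0, w=0.5)
node_b = ServerStats("node-b", i_max=200.0, d=100.0, t=2.0, c=0.5, s=0.0, w=0.25)

if __name__ == "__main__":
    print(pick_server([node_a, node_b]))
```

With these sample values the two servers differ only in CPU utilization, so the less loaded one is returned.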
CN202110469599.3A 2021-04-28 2021-04-28 Remote sensing data rapid concurrent read-write method based on distributed file system Active CN113204520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110469599.3A CN113204520B (en) 2021-04-28 2021-04-28 Remote sensing data rapid concurrent read-write method based on distributed file system

Publications (2)

Publication Number Publication Date
CN113204520A CN113204520A (en) 2021-08-03
CN113204520B true CN113204520B (en) 2023-04-07

Family

ID=77027104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110469599.3A Active CN113204520B (en) 2021-04-28 2021-04-28 Remote sensing data rapid concurrent read-write method based on distributed file system

Country Status (1)

Country Link
CN (1) CN113204520B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117376344A (en) * 2023-12-08 2024-01-09 荣耀终端有限公司 Data transmission method, electronic device, and computer-readable storage medium

Citations (1)

Publication number Priority date Publication date Assignee Title
CN105183839A (en) * 2015-09-02 2015-12-23 华中科技大学 Hadoop-based storage optimizing method for small file hierarchical indexing

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN103544045A (en) * 2013-10-16 2014-01-29 南京大学镇江高新技术研究院 HDFS-based virtual machine image storage system and construction method thereof
US10353923B2 (en) * 2014-04-24 2019-07-16 Ebay Inc. Hadoop OLAP engine
CN106027638B (en) * 2016-05-18 2019-04-12 华中科技大学 A kind of hadoop data distributing method based on hybrid coding
CN106250473B (en) * 2016-07-29 2019-11-12 江苏物联网研究发展中心 Remote sensing image cloud storage method
WO2019071595A1 (en) * 2017-10-13 2019-04-18 华为技术有限公司 Method and device for storing data in distributed block storage system, and computer readable storage medium

Also Published As

Publication number Publication date
CN113204520A (en) 2021-08-03

Similar Documents

Publication Publication Date Title
US20210056074A1 (en) File System Data Access Method and File System
US10725976B2 (en) Fast recovery using self-describing replica files in a distributed storage system
US8850127B2 (en) Managing concurrent accesses to a cache
US20140149702A1 (en) Cloud scale directory services
CN110109873B (en) File management method for message queue
CN112334891B (en) Centralized storage for search servers
CN114281762B (en) Log storage acceleration method, device, equipment and medium
US8612717B2 (en) Storage system
CN113204520B (en) Remote sensing data rapid concurrent read-write method based on distributed file system
US10387384B1 (en) Method and system for semantic metadata compression in a two-tier storage system using copy-on-write
US20190243807A1 (en) Replication of data in a distributed file system using an arbiter
KR100907477B1 (en) Apparatus and method for managing index of data stored in flash memory
US8082230B1 (en) System and method for mounting a file system on multiple host computers
US11429311B1 (en) Method and system for managing requests in a distributed system
US20210326271A1 (en) Stale data recovery using virtual storage metadata
US10055139B1 (en) Optimized layout in a two tier storage
CN115048046B (en) Log file system and data management method
US11467777B1 (en) Method and system for storing data in portable storage devices
US20230006814A1 (en) Method and apparatus for implementing changes to a file system that is emulated with an object storage system
US10628391B1 (en) Method and system for reducing metadata overhead in a two-tier storage architecture
CN111444114B (en) Method, device and system for processing data in nonvolatile memory
CN109918355A (en) Realize the virtual metadata mapped system and method for the NAS based on object storage service
US11966637B1 (en) Method and system for storing data in portable storage devices
US20090249007A1 (en) Method and system for accessing data using an asymmetric cache device
CN115982101B (en) Machine room data migration method and device based on multi-machine room copy placement strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant