CN107480150B - File loading method and device - Google Patents


Info

Publication number
CN107480150B
CN107480150B (application CN201610399257.8A)
Authority
CN
China
Prior art keywords
data
reading
file
memory
read
Prior art date
Legal status
Active
Application number
CN201610399257.8A
Other languages
Chinese (zh)
Other versions
CN107480150A (en)
Inventor
黄硕
刘俊峰
姚文辉
朱家稷
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201610399257.8A priority Critical patent/CN107480150B/en
Publication of CN107480150A publication Critical patent/CN107480150A/en
Application granted granted Critical
Publication of CN107480150B publication Critical patent/CN107480150B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/18: File system types
    • G06F 16/182: Distributed file systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/17: Details of further file system functions
    • G06F 16/172: Caching, prefetching or hoarding of files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A file loading method and device. The file loading device establishes a memory-mapped file for a disk file, reads data from it sequentially, and repeatedly sends pre-read notifications to the operating system; each notification asks the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory backing the memory-mapped file. The file loading device then stores the data read from the memory-mapped file into the final in-memory data structure according to the data's storage format. The device comprises a data loading module, a file reading module, and a memory management module. The method and device improve file loading speed while keeping memory usage during loading stable and controllable.

Description

File loading method and device
Technical Field
The present invention relates to the field of computers, and in particular, to a file loading method and apparatus.
Background
In a large-scale distributed file system, to serve massive numbers of clients with high concurrency and low latency, the metadata server (Meta Server) usually keeps all metadata in memory, and persists it by recording a metadata operation log (oplog) and periodically generating a metadata image file (checkpoint file). A metadata image file typically consists of several data areas and a header. The data areas record large volumes of metadata such as the directory path tree, the file table, and data replica locations, and large blocks of data in the data areas (e.g., over 4 KB) are compressed to save space; the header records the start and end positions of the data areas, the compression parameters, and so on.
Fast loading of the metadata image file matters greatly for the availability and operability of the distributed file system. When the metadata server restarts because of a software upgrade, a failure, or the like, it must load the most recently generated image file and replay all operation logs recorded after that file in order to recover the in-memory metadata. On the one hand, the loading step lies on the critical path of metadata-server upgrades and failure recovery, so shortening it directly shortens upgrade time and failure-recovery time; on the other hand, at large scale the image file can reach hundreds of GB, and the loading step can account for 50%-80% or more of the server's total restart time, so its impact on upgrade and recovery time is significant.
Existing file loading methods can be divided into two categories according to how they read the file:
the first is based on file interfaces: a file-reading interface (such as the C read function or the Java File API) is used to read the image-file data sequentially, process it, and store the resulting metadata into the corresponding memory structures. While the reading thread calls the file-reading interface, the file library continuously invokes the operating system's read function to load data from disk into memory and copies it into a user or library buffer. When a compressed metadata block is encountered while reading the image file, it can either be decompressed synchronously and then processed, or copied into a temporary memory buffer and handed to a dedicated thread that asynchronously decompresses it into the final metadata memory structure.
This approach has at least two factors that hurt loading speed. 1. Overhead inside the library: to maintain internal buffers, guarantee concurrency safety, and so on, a file library (e.g., the glibc file library) performs several state updates and maintenance operations on every call. When each read is small (for example, many four-byte integers sit between data blocks in the metadata file), the library's internal overhead becomes a large fraction of the total cost of one read operation, lowering effective CPU utilization and hence the data-reading speed. 2. Memory-copy overhead: after the operating system reads data from disk into kernel space, the file interface must copy it once from kernel space into the user space where the reading thread runs; only then can the thread decompress the data into the final metadata memory structure. Memory copying is fast, but at data volumes of several hundred GB it still takes noticeable time.
The other method is based on memory-mapped files: the image file is mapped into a region of the metadata server's process address space, and the region is accessed sequentially to obtain and process the file data. The operating system can additionally be told that subsequent reads will be sequential, which helps it optimize data loading (e.g., using the Linux madvise function with the MADV_SEQUENTIAL parameter). While the reading thread accesses the memory, touching an address whose data has not yet been loaded from disk raises a page fault that must go to disk (major page fault), and the operating system synchronously waits for the data to load; meanwhile, the operating system asynchronously and continuously reads ahead (read-ahead), pulling not-yet-accessed file content from disk into the page cache so that later accesses hit the page cache directly (minor page faults) instead of faulting to disk.
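As a concrete illustration of this second approach, the sketch below uses Python's mmap module as a stand-in for the C mmap/madvise calls; the function name `load_sequentially` and the 64 KB chunk size are illustrative, and the MADV_SEQUENTIAL hint is guarded because the constant exists only on platforms that provide madvise:

```python
import mmap
import os

def load_sequentially(path, chunk=64 * 1024):
    """Map a file and read it front to back, hinting sequential access.

    MADV_SEQUENTIAL (Linux; guarded so the sketch runs elsewhere too)
    tells the kernel to enlarge its read-ahead window for this mapping.
    """
    size = os.path.getsize(path)
    chunks = []
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)
        try:
            if hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            for off in range(0, size, chunk):
                # Touching not-yet-loaded pages here raises major page
                # faults; pages the kernel pre-read hit the page cache.
                chunks.append(mm[off:off + chunk])
        finally:
            mm.close()
    return b"".join(chunks)
```

The two problems described next (resident-memory growth and kernel-chosen read-ahead amounts) apply exactly to this kind of loop, since it neither releases consumed pages nor controls how much the kernel pre-reads.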
This method also has two problems. 1. Process memory usage can exceed its limit: once part of the memory-mapped file has been accessed, the memory pages holding that data are counted in the process's resident memory, so data that has already been consumed still occupies the metadata server's memory. Until the operating system reclaims those pages, the server's memory may, on the one hand, exceed its configured upper limit, causing the server to be killed by a resource-limiting program; on the other hand, the memory available to other processes shrinks. 2. Insufficient operating-system read-ahead: while the memory-mapped file is read sequentially, the operating system does read ahead, overlapping disk access with data processing. However, when to read ahead and how much to read each time are decided by the operating system itself. Designed for generality, it reads ahead only a modest amount at a time, which often cannot keep up with parallel processing, so page faults still occur and disk throughput cannot be fully exploited.
Similar problems exist for other application scenarios that require loading a disk file into a memory data structure according to a corresponding format.
Disclosure of Invention
In view of this, the present invention provides the following.
A file loading method is applied to a file loading device and comprises the following steps:
establishing a memory-mapped file of a disk file, sequentially reading data from the memory-mapped file, and sending pre-read notifications to an operating system multiple times, wherein each notification tells the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory corresponding to the memory-mapped file;
and storing the data read from the memory mapping file into a final data structure of the memory according to the storage format of the data.
A file loading device comprises a data loading module, a file reading module and a memory management module, wherein:
the data loading module is used for continuously initiating a reading request to the file reading module and storing the obtained data into a final data structure of the memory according to the storage format of the data;
the file reading module is used for establishing a memory mapping file of a disk file, receiving a reading request of the data loading module, sequentially reading data from the memory mapping file, and triggering the memory management module to perform memory management;
the memory management module is configured to perform memory management, which includes: sending pre-read notifications to an operating system multiple times, each notification telling the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory corresponding to the memory-mapped file.
The file loading method and the file loading device can make full use of the capacities of the disk and the CPU, improve the loading speed of the file, and enable the memory usage amount in the loading process to be stable and controllable.
Drawings
FIG. 1 is a flow chart of a method for loading a file according to an embodiment of the present invention;
fig. 2 is a block diagram of a file loading apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
In this embodiment, the file loading method based on a memory-mapped file is improved: the file loading device actively manages pre-reading and notifies the operating system of the pre-read position and pre-read data amount, so that the operating system's pre-reading matches the device's reading of the memory-mapped file, thereby increasing the file's loading speed.
As shown in fig. 1, the file loading method of this embodiment includes:
step 110, establishing a memory-mapped file of a disk file, sequentially reading data from the memory-mapped file, and sending pre-read notifications to an operating system multiple times, wherein each notification tells the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory corresponding to the memory-mapped file;
in this embodiment, sending the pre-read notification to the operating system for multiple times is implemented by setting a pre-read position and comparing the pre-read position, and specifically, when the file loading device sequentially reads data from the memory mapped file, the file loading device sends the pre-read notification to the operating system once each time the data read position reaches the set pre-read position. Because the size of the data read each time is changed, the change of the data reading position is also large or small, and the data reading position reaches the set pre-reading position, and the data reading position can be equal to the set pre-reading position or exceed the set pre-reading position.
In this embodiment, the pre-read positions are set one after another: each time the data-reading position reaches the most recently set pre-read position, one pre-read notification is sent to the operating system, and a new pre-read position is set before the end position, in the memory-mapped file, of the data the operating system will pre-read in response to that notification. This ensures the file loading device keeps reading data that is already in memory and avoids frequent page faults. The initial value of the pre-read position may be set at or before the initial data-reading position; alternatively, for the first pre-read, before reading begins the device can simply treat the current reading position as having reached the pre-read position and have the operating system pre-read from the start of the memory-mapped file, without relying on the comparison above. Setting positions one after another avoids tying up too many resources, but all pre-read positions could also be set in advance.
For convenience, the pre-read position, the data-reading position, and the end position of the pre-read data in the memory-mapped file can be represented as offsets (e.g., offsets of these positions relative to the start of the memory-mapped file). How far before the end position of the pre-read data the new pre-read position is placed is a parameter with many possible concrete settings.
In this embodiment, the newly set pre-read position is determined by the formula Pa = Pb + Ps, where Pa is the new pre-read position; Pb is the specified position, equal to the current data-reading position or to the previously set pre-read position; and Ps equals the specified data amount minus the specified pre-read lead. The new pre-read position then lies before the end position of the data the operating system pre-reads in the memory-mapped file, separated from it by the specified pre-read lead.
When the current data-reading position reaches the previously set pre-read position, the two may differ by some amount (the reading position may have overshot), and the size of that deviation depends on the size of the data units in the disk file; if the deviation is large (e.g., a compressed block was just read), the actual lead may be much smaller than the specified lead. Taking Pb equal to the current data-reading position controls the pre-read lead more precisely and suits a variety of file formats; if the lead requirement can still be met, Pb can instead be taken as the previously set pre-read position. Note that the specified data amount and specified position may be carried in each notification or follow defaults, depending on the interface function; the initial pre-read position may likewise be some default value, such as 0, or be carried in the notification. These implementation details do not limit the invention. In addition, when the operating system starts pre-reading from a specified position, some of that data may already have been pre-read; the operating system can detect and skip it.
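With the choice Pb = current data-reading position, the formula can be checked in a few lines (the function name `next_preread_pos` and the 32 MB / 32 KB sample values are illustrative):

```python
def next_preread_pos(read_pos, data_amount, ahead):
    """Pa = Pb + Ps, with Pb = the current data-reading position and
    Ps = specified data amount - specified pre-read lead."""
    return read_pos + (data_amount - ahead)
```

Each trigger therefore advances the pre-read position by (data amount - lead), which is exactly the step size this embodiment attributes to the specified data amount.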
And step 120, storing the data read from the memory mapping file into a final data structure of the memory according to the storage format of the data.
It should be noted that steps 110 and 120 are not necessarily sequential; they are numbered this way only for convenience. In fact, storing the data read from the memory-mapped file into the final in-memory data structure proceeds in parallel with the pre-reading.
If the disk file contains compressed blocks (for example, when the disk file is a metadata image file of a distributed file system), the data can be handled according to the amount read, as follows:
after each read from the memory-mapped file, judge whether the amount of data read is smaller than the minimum decompressed size of a compressed block of the disk file:
if so, store the read data into the final memory data structure according to the data's storage format;
if not, decompress the read data in a multithreaded, asynchronous manner, and then store the decompressed data into the final memory data structure according to the data's storage format.
To reclaim the memory occupied by already-read data promptly, the method of this embodiment further includes: recording the last three set pre-read positions and keeping them updated; and, each time the data-reading position reaches the most recently set pre-read position, before setting a new pre-read position, notifying the operating system to reclaim the memory region between the first two of those three positions. In this way the user can specify a maximum memory footprint M: the specified data amount is set to a value greater than AHEAD and at most M/2 + AHEAD, and preferably to a value between M/2 and M/2 + AHEAD, where AHEAD is the specified pre-read lead. With this setting, the memory actually occupied by the memory-mapped file during loading does not exceed M. In this embodiment, the specified data amount determines the step size by which the pre-read position advances.
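The bound can be sanity-checked numerically. The sketch below is an assumption-laden illustration, not the patent's method: it takes the preferred setting (specified data amount S in [M/2, M/2 + AHEAD]) and the observation that at most two pre-read windows are resident before the older one is reclaimed:

```python
def footprint_bound(m, ahead, s):
    """Peak resident data is at most two live windows of size s; with
    s <= m/2 + ahead this stays within m plus a small 2*ahead slack."""
    assert ahead < s <= m // 2 + ahead, "S must lie in (AHEAD, M/2 + AHEAD]"
    return 2 * s  # upper bound on bytes resident at once

# Example values: M = 64 MB cap, AHEAD = 32 KB lead, S = M/2.
M, AHEAD = 64 * 2**20, 32 * 2**10
peak = footprint_bound(M, AHEAD, M // 2)
```

With S = M/2 the bound is exactly M; pushing S up to M/2 + AHEAD adds only 2·AHEAD (64 KB here) of slack.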
As shown in fig. 2, the file loading apparatus of this embodiment includes a data loading module 10, a file reading module 20, and a memory management module 30, which may be implemented by software running on corresponding hardware or by hardware circuits, where:
the data loading module 10 is configured to continuously initiate a read request to the file reading module, and store the obtained data in a final data structure of the memory according to a storage format of the data;
the file reading module 20 is configured to establish a memory mapping file of a disk file, receive a reading request of the data loading module, sequentially read data from the memory mapping file, and trigger the memory management module to perform memory management;
the memory management module 30 is configured to perform memory management, which includes: sending pre-read notifications to an operating system multiple times, each notification telling the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory corresponding to the memory-mapped file.
Alternatively,
the memory management module sends pre-read notifications to the operating system multiple times as follows: while data is read sequentially from the memory-mapped file, a pre-read notification is sent to the operating system each time the data-reading position reaches the currently set pre-read position.
Alternatively,
the memory management module successively sets the pre-reading position according to the following mode:
each time the data-reading position reaches the most recently set pre-read position, it sends one pre-read notification to the operating system and sets a new pre-read position before the end position, in the memory-mapped file, of the data the operating system pre-reads in response to that notification.
Alternatively,
the memory management module determines the newly set pre-read position by the formula Pa = Pb + Ps, where Pa is the new pre-read position; Pb is the specified position, equal to the current data-reading position or to the previously set pre-read position; and Ps equals the specified data amount minus the specified pre-read lead.
Alternatively,
the memory management module's memory management further includes: recording the last three set pre-read positions and keeping them updated; and, each time the data-reading position reaches the most recently set pre-read position, before setting a new pre-read position, notifying the operating system to reclaim the memory region between the first two of the last three set pre-read positions.
Alternatively,
the specified data amount is less than or equal to M/2 + AHEAD and greater than or equal to M/2, where M is the user-specified maximum memory footprint and AHEAD is the user-specified pre-read lead.
Alternatively,
the file loading device also comprises a decompression module;
each time the file reading module reads data from the memory-mapped file, it further judges whether the amount of data read is smaller than the minimum decompressed size of a compressed block of the disk file: if so, it returns the read data to the data loading module; if not, it hands the read data to the decompression module for processing;
the decompression module decompresses the received data in a multithreaded, asynchronous manner and then stores the decompressed data into the final memory data structure according to the data's storage format.
Alternatively,
the disk file loaded by the file loading device is a metadata memory mapping file of the distributed file system.
The invention is described below using an example in one application.
The example relates to a file loading device in a distributed file system, and a disk file to be loaded is a metadata memory mapping file of the distributed file system.
The file loading apparatus of this example includes a metadata loading module, a file reading module, a memory management module, and a decompression module, where:
and the metadata loading module is an initiator and a controller in the loading process and is used for continuously initiating a file reading request to the file reading module and storing the obtained metadata into a final metadata structure of the memory according to the storage format of the data so as to realize the recovery of the metadata.
And the file reading module is used for establishing a memory mapping file of the disk file, receiving a reading request of the data loading module, sequentially reading data from the memory mapping file, and triggering the memory management module to perform memory management. And the read data is returned to the metadata loading module or handed to the decompression module for processing.
The decompression module asynchronously decompresses the data submitted by the file reading module and then stores the decompressed data into the final memory data structure according to the data's storage format. When submitting data, the file reading module can pass the decompression module the memory address and length of the compressed data block, the expected decompressed length, and the metadata-structure address supplied by the metadata loading module (i.e., the decompression destination). The decompression module can access that memory address directly and decompress straight into the metadata structure, with no extra user-space memory copy; it can also run multiple decompression threads in parallel to make full use of a multi-core CPU. Thus, while reading the image-file data, when a compressed metadata block is encountered, its memory address and length can be handed directly to a dedicated thread that asynchronously decompresses it into the final in-memory metadata structure, without any additional memory buffer or copy.
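A minimal sketch of such a decompression module, with zlib and a thread pool standing in for whatever codec and threading the implementation uses; the class name, `submit_block`, and the length check are assumptions. Note one deliberate simplification: `zlib.decompress` allocates an intermediate bytes object, whereas the scheme described above decompresses directly into the destination with no extra copy.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

class Decompressor:
    """Decompress blocks on worker threads, writing the result into the
    caller-supplied destination buffer at the given offset."""

    def __init__(self, workers=4):
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit_block(self, src, expected_len, dest, dest_off):
        def work():
            out = zlib.decompress(src)
            if len(out) != expected_len:  # expected length comes with the task
                raise ValueError("decompressed size mismatch")
            dest[dest_off:dest_off + expected_len] = out
        return self.pool.submit(work)
```

The caller passes the compressed bytes, the expected decompressed length, and the destination structure, mirroring the task description above; the returned future lets the loader continue and join later.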
The memory management module performs memory management, which includes sending pre-read notifications to the operating system multiple times; each notification tells the operating system to pre-read a specified amount of data at a specified position in the disk file into the memory corresponding to the memory-mapped file. Specifically, in this example the module records the current data-reading position and the last three set pre-read positions and keeps them updated; ordered from earliest set to latest set, and by their relation to the current reading position, the three are the position to be reclaimed next, the position of the last pre-read notification, and the position at which the next pre-read must be notified. In this way the operating system is guided, on the one hand, to reclaim no-longer-needed memory pages, keeping memory usage under control, and on the other hand, to perform ample disk read-ahead, raising disk throughput.
In this example, the memory management module uses a sliding-interval policy to keep memory usage during reading approximately below a user-specified maximum MSIZE (e.g., 64 MB): while the memory-mapped file is read sequentially, the interval slides backward continuously in steps of MSIZE/2 - AHEAD, forming an interval of size 2 × (MSIZE/2 - AHEAD); when the MSIZE/2 bytes pre-read last time are almost fully read (judged via the specified pre-read lead AHEAD, e.g., only 32 KB remain), the older MSIZE/2 - AHEAD of memory is reclaimed, the newer MSIZE/2 - AHEAD is kept for the decompression module to continue accessing, and the next MSIZE/2 bytes are then pre-read. This keeps the memory in use at roughly MSIZE throughout sequential reading; a larger MSIZE yields higher pre-read disk throughput. The step size can also be set to other values, such as MSIZE/2.
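The bookkeeping of this sliding-interval policy can be sketched as a small state machine (class and method names are illustrative, and the test uses toy sizes rather than 64 MB):

```python
class SlidingWindow:
    """Track the three positions (to-drop, last-notified, next-trigger)
    and report, on each read, what to reclaim and what to pre-read."""

    def __init__(self, start, msize, ahead):
        self.to_drop = self.last_madvise = self.next_madvise = start
        self.step = msize // 2 - ahead   # slide distance per trigger
        self.preread = msize // 2        # bytes pre-read per trigger

    def on_read(self, current_pos):
        """Return ((drop_off, drop_len), (preread_off, preread_len)) when
        current_pos has reached the trigger position, else None. The drop
        part is None on the very first trigger (nothing consumed yet)."""
        if current_pos < self.next_madvise:
            return None
        drop = None if self.to_drop == current_pos else (self.to_drop, self.step)
        preread = (current_pos, self.preread)
        self.to_drop, self.last_madvise = self.last_madvise, self.next_madvise
        self.next_madvise = current_pos + self.step
        return drop, preread
```

Because the to-drop position lags two triggers behind, the region handed back to the operating system is always one already consumed by the reader, matching the "older half reclaimed, newer half kept" behavior above.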
In this example, the memory management module decides independently based on the current data-reading position, without recording how the decompression module uses the memory intervals; this improves decision efficiency on the critical path and speeds up loading. Because multithreaded decompression is fast, by the time the older half of the interval is reclaimed, the related decompression tasks have, with high probability, already completed; even if a few unfinished decompression tasks later touch the reclaimed memory, the operating system's page-fault mechanism ensures the required data can be loaded again on demand.
In this example, the process of loading the metadata memory mapping file SFile by the file loading device is as follows:
Step one: the file reading module opens the header of SFile and obtains information such as the data-area position (Offset and Size), the compression type, and the minimum actual data size of a compressed block (e.g., 4 KB); it establishes the memory-mapped file for the data area (for example, by calling a Linux function) and obtains the corresponding memory region; and it initializes the decompression module according to the compression type;
Step two: initialize the memory management module: set the maximum memory usage to BUF_SIZE (e.g., 64 MB; called MSIZE below); set the initial data-reading position CurrentPos to Offset, the start of the memory-mapped file; and set the position of the last pre-read notification, LastMadvise, the position at which the next pre-read must be notified, NextMadvise, and the position to be reclaimed next, ToDrop, to CurrentPos or before CurrentPos;
Step three: after confirming that initialization succeeded, the data loading module continuously sends read requests req to the file reading module. A req can contain the amount of data to read, req.size (e.g., 4, to read a four-byte integer), and the read destination, req.dest (e.g., read 100 bytes into the final data-structure address of some piece of metadata);
Step four: when the memory management module receives a request req, it first judges whether memory-management actions are needed: if CurrentPos >= NextMadvise, the data pre-read last time is almost fully read, and three pieces of work follow:
(a) The memory management module notifies the operating system to reclaim the MSIZE/2 - AHEAD bytes of data starting at ToDrop (if ToDrop equals CurrentPos, this step is skipped). On Linux, the reclamation is done with the madvise function, using the MADV_DONTNEED parameter value. The operating system reclaims the corresponding memory pages asynchronously and quickly. Even if the decompression module later accesses a reclaimed memory address (a very low-probability event), the page-fault mechanism ensures the data is read from disk again as needed.
(b) The memory management module notifies the operating system to pre-read data, loading the MSIZE/2 bytes of file data starting from CurrentPos. On Linux, the pre-read notification is implemented with the madvise function using the MADV_WILLNEED parameter value. The loading is carried out asynchronously by the operating system, and the loaded data enters the system Page Cache.
(c) Slide the available memory window backwards and update the state: ToDrop = LastMadvise, LastMadvise = NextMadvise, NextMadvise = CurrentPos + MSIZE/2 - AHEAD. AHEAD (for example, 32KB) is the amount by which pre-reading is started early; it ensures that the next pre-read is issued before the previously pre-read data has been fully consumed, so that part of the new data has already arrived by the time the previous data is actually exhausted, avoiding the major page faults that wait on the disk between two pre-reads.
Step five, after the memory management operation finishes, the file reading module distinguishes two cases:
(a) If req.size is not smaller than the minimum decompressed data size of a compressed block, the request is for a block of data that needs decompression: the size of the compressed block is read, a decompression task is generated from it together with req.size (that is, the decompressed size), the destination address, and the current position, and the task is handed to the decompression module for processing; CurrentPos is then updated to CurrentPos + the compressed block size, and finally the loading module is informed that the request is being processed asynchronously so that it can continue;
(b) If the requested data amount is smaller than the minimum decompressed data amount of a compressed block, the required data is read directly from the current position, converted to the required data type, and returned to the loading module, and CurrentPos is updated to CurrentPos + req.size.
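The step-five dispatch can be sketched as follows. The 4-byte compressed-size prefix, the MIN_BLOCK constant, and the synchronous `queue_decompress` stub are illustrative assumptions; in the patent the decompression runs asynchronously in a separate module:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MIN_BLOCK = 4096 };  /* assumed minimum decompressed block size */

/* Stand-in for the asynchronous decompression module: a real one would
 * decompress (src, comp_size) into (dst, dst_size) on a worker thread. */
static size_t queued;
static void queue_decompress(const char *src, uint32_t comp_size,
                             size_t dst_size, void *dst)
{
    (void)src; (void)dst_size; (void)dst;
    queued += comp_size;
}

/* Serve one request against the mapped data at offset `cur`;
 * returns the new CurrentPos. */
static size_t serve_request(const char *map, size_t cur,
                            size_t req_size, void *dest)
{
    if (req_size >= MIN_BLOCK) {      /* (a) block needing decompression */
        uint32_t comp_size;
        memcpy(&comp_size, map + cur, sizeof comp_size);
        queue_decompress(map + cur + 4, comp_size, req_size, dest);
        return cur + 4 + comp_size;   /* skip prefix + compressed bytes */
    }
    memcpy(dest, map + cur, req_size);/* (b) small raw value, read in place */
    return cur + req_size;
}
```

Because case (a) only hands the block off and advances CurrentPos, the loading module never blocks on decompression, which is what lets disk reads and decompression overlap.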
Step six, the data loading module returns to step three and continues execution until loading is finished.
Step seven, after the data loading module has read all file data, it waits for the decompression module to complete all decompression tasks, and then notifies the file reading module to release the memory mapping file (for example, by calling the Linux munmap function) and clean up the environment. At this point, the metadata image file has been loaded.
In a specific implementation, when parameters such as Offset and Size are passed to functions such as Linux mmap and madvise, they are first made memory page aligned (page alignment). When reading data at the end of the file, the parameter values may be adjusted so that the range does not run past the end of the file.
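A small helper in the spirit of this adjustment might look as follows (the PAGE constant and function name are illustrative; real code would query the page size with sysconf(_SC_PAGESIZE)):

```c
#include <assert.h>
#include <stddef.h>

enum { PAGE = 4096 };  /* assumed page size */

/* Round `off` down to a page boundary, grow `*len` to compensate, and
 * clamp the range so it never runs past `file_size`.
 * Returns the aligned offset; the adjusted length is left in *len. */
static size_t page_align(size_t off, size_t *len, size_t file_size)
{
    size_t aligned = off & ~(size_t)(PAGE - 1);
    *len += off - aligned;            /* cover the bytes before `off` */
    if (aligned + *len > file_size)
        *len = file_size - aligned;   /* clamp at end of file */
    return aligned;
}
```

Both adjustments matter: mmap requires a page-aligned offset, and madvise requires a page-aligned start address, while a range past end-of-file would touch pages that have no backing data.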
It can be seen that in the above process, through actively managed operating-system pre-reading, efficient determination of when memory management operations are needed, and asynchronous multithreaded decompression, the time-consuming decompression and disk-read operations run fully in parallel with the critical path of file reading, which improves the effective utilization of the disk and the CPU and increases the loading speed. At the same time, by reasonably controlling the timing, position, and amount of pre-reading and memory cleaning, the total memory usage is kept near the specified maximum and memory overrun is avoided.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A file loading method is applied to a file loading device and comprises the following steps:
establishing a memory mapping file of a disk file, sequentially reading data from the memory mapping file, and sending a pre-reading notification to an operating system for multiple times, wherein the operating system is notified each time to pre-read data at a specified position and a specified data amount in the disk file into a memory corresponding to the memory mapping file;
and storing the data read from the memory mapping file into a final data structure of the memory according to the storage format of the data.
2. The method of claim 1, comprising:
the sending the pre-read notification to the operating system a plurality of times includes:
and when the data are sequentially read from the memory mapping file, sending a pre-reading notice to the operating system each time the data reading position reaches the set pre-reading position.
3. The method of claim 2, comprising:
the pre-reading position is set successively according to the following modes:
and sending a pre-reading notification to the operating system once each time the data reading position reaches the pre-reading position set for the last time, and resetting the pre-reading position before the end position of the data pre-read by the operating system in the memory mapping file according to the pre-reading notification.
4. The method of claim 3, comprising:
the reset pre-read position is determined according to the following formula:
Pa=Pb+Ps
wherein Pa is the reset pre-reading position; Pb is the designated position and is equal to the current data reading position or the last set pre-reading position; and Ps is equal to the specified data amount minus the specified pre-reading advance.
5. The method of claim 3 or 4, comprising:
the method further comprises the following steps:
recording the three most recently set pre-reading positions and continuously updating them; and each time the data reading position reaches the most recently set pre-reading position, before resetting the pre-reading position, notifying the operating system to clean the memory area between the two earlier of the three recorded pre-reading positions.
6. The method of any of claims 1-4, comprising:
the specified data volume is less than or equal to M/2+ AHEAD and more than or equal to M/2, wherein M is the maximum memory occupation volume specified by the user, and AHEAD is the pre-reading advance volume specified by the user.
7. The method of any of claims 1-4, wherein:
according to the storage format of the data, storing the data read from the memory mapping file into a final data structure of the memory, including:
after reading data from the memory mapping file each time, judging whether the data volume of the read data is smaller than the minimum data volume of the compressed block of the disk file after decompression:
if so, storing the read data into a final memory data structure according to the storage format of the data;
if not, decompressing the read data in a multithreading and asynchronous mode, and then storing the decompressed data in a final memory data structure according to the storage format of the data.
8. The method of any of claims 1-4, wherein:
the disk file is a metadata memory mapping file of the distributed file system.
9. A file loading device is characterized by comprising a data loading module, a file reading module and a memory management module, wherein:
the data loading module is used for continuously initiating a reading request to the file reading module and storing the obtained data into a final data structure of the memory according to the storage format of the data;
the file reading module is used for establishing a memory mapping file of a disk file, receiving a reading request of the data loading module, sequentially reading data from the memory mapping file, and triggering the memory management module to perform memory management;
the memory management module is configured to perform memory management, where the memory management includes: and sending a pre-reading notice to an operating system for multiple times, and informing the operating system to pre-read the data at the appointed position and the appointed data volume in the disk file to the memory corresponding to the memory mapping file every time.
10. The apparatus of claim 9, wherein:
the memory management module sends a pre-reading notice to an operating system for multiple times, and the pre-reading notice comprises the following steps: and when the data are sequentially read from the memory mapping file, sending a pre-reading notice to the operating system each time the data reading position reaches the set pre-reading position.
11. The apparatus of claim 10, wherein:
the memory management module successively sets the pre-reading position according to the following mode:
and sending a pre-reading notification to the operating system once each time the data reading position reaches the pre-reading position set for the last time, and resetting the pre-reading position before the end position of the data pre-read by the operating system in the memory mapping file according to the pre-reading notification.
12. The apparatus of claim 11, wherein:
the memory management module determines the reset pre-reading position according to the following formula: Pa = Pb + Ps, wherein Pa is the reset pre-reading position; Pb is the designated position and is equal to the current data reading position or the last set pre-reading position; and Ps is equal to the specified data amount minus the specified pre-reading advance.
13. The apparatus of claim 11 or 12, wherein:
the memory management module performs memory management, and further includes: recording the three most recently set pre-reading positions and continuously updating them; and each time the data reading position reaches the most recently set pre-reading position, before resetting the pre-reading position, notifying the operating system to clean the memory area between the two earlier of the three recorded pre-reading positions.
14. The apparatus of any of claims 9-12, wherein:
the specified data amount is less than or equal to M/2 + AHEAD and greater than or equal to M/2, wherein M is the maximum memory occupation specified by the user, and AHEAD is the pre-reading advance specified by the user.
15. The apparatus of any of claims 9-12, wherein:
the file loading device also comprises a decompression module;
after the file reading module reads data from the memory mapped file each time, the method further comprises: judging whether the data volume of the read data is smaller than the minimum data volume of the compressed block of the disk file after decompression: if yes, returning the read data to the data loading module; if not, the read data is handed to the decompression module for processing;
the decompression module is used for decompressing the received data in a multithreading and asynchronous mode and then storing the decompressed data into a final memory data structure according to the storage format of the data.
16. The apparatus of any of claims 9-12, wherein:
the disk file loaded by the file loading device is a metadata memory mapping file of the distributed file system.
CN201610399257.8A 2016-06-07 2016-06-07 File loading method and device Active CN107480150B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610399257.8A CN107480150B (en) 2016-06-07 2016-06-07 File loading method and device


Publications (2)

Publication Number Publication Date
CN107480150A CN107480150A (en) 2017-12-15
CN107480150B true CN107480150B (en) 2020-12-08

Family

ID=60594157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610399257.8A Active CN107480150B (en) 2016-06-07 2016-06-07 File loading method and device

Country Status (1)

Country Link
CN (1) CN107480150B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3204323B2 (en) * 1991-07-05 2001-09-04 エヌイーシーマイクロシステム株式会社 Microprocessor with built-in cache memory
CN101315595A (en) * 2008-06-30 2008-12-03 华为技术有限公司 Data reading method and device
CN101814038A (en) * 2010-03-23 2010-08-25 杭州顺网科技股份有限公司 Method for increasing booting speed of computer
CN102483949A (en) * 2009-08-31 2012-05-30 桑迪士克以色列有限公司 Preloading data into a flash storage device
CN102707966A (en) * 2012-04-12 2012-10-03 腾讯科技(深圳)有限公司 Method and device for acceleratively starting operating system, and method, device and terminal for generating prefetched information
CN102750174A (en) * 2012-06-29 2012-10-24 Tcl集团股份有限公司 Method and device for loading file
CN102799456A (en) * 2012-07-24 2012-11-28 上海晨思电子科技有限公司 Method and device for uploading resource files by game engine, and computer
CN103856567A (en) * 2014-03-26 2014-06-11 西安电子科技大学 Small file storage method based on Hadoop distributed file system
WO2014138498A1 (en) * 2013-03-06 2014-09-12 Recupero Gregory Improved client spatial locality through the use of virtual request trackers
CN105487987A (en) * 2015-11-20 2016-04-13 深圳市迪菲特科技股份有限公司 Method and device for processing concurrent sequential reading IO (Input/Output)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant