CN102147802A - Pseudo-random type NFS application acceleration system - Google Patents

Pseudo-random type NFS application acceleration system

Info

Publication number
CN102147802A
CN102147802A
Authority
CN
China
Prior art keywords
nfs
module
client
pseudorandom
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010106117218A
Other languages
Chinese (zh)
Other versions
CN102147802B (en)
Inventor
骆志军
许建卫
袁清波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd filed Critical Dawning Information Industry Beijing Co Ltd
Priority to CN 201010611721 priority Critical patent/CN102147802B/en
Publication of CN102147802A publication Critical patent/CN102147802A/en
Application granted granted Critical
Publication of CN102147802B publication Critical patent/CN102147802B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a pseudo-random network file system (NFS) application acceleration system comprising two kernel modules: pfnfsd (prefetch network file system daemon) and dmcache (device mapper cache). pfnfsd is a rewrite of the kernel module nfsd; it mainly comprises an NFS client request positioning module and a prefetch module, and is used to capture and locate NFS requests from clients and to perform prefetch operations. dmcache is a target driver implemented on top of the Linux kernel device mapper mechanism; it mainly comprises modules for managing the high-speed storage device, mapping addresses, and reclaiming resources, and it stores the prefetched data on the high-performance storage device. The system is transparent to applications, requires only small changes to the kernel, and can attach additional high-speed storage devices.

Description

Pseudo-random NFS application acceleration system
Technical field
The present invention relates to the field of storage system I/O performance optimization, and in particular to a pseudo-random NFS application acceleration system.
Background technology
The Network File System (NFS) is often the system of choice for cluster network storage because of its stability, ease of use, and free, open-source nature. Although many other network storage products exist today, they either cannot match NFS in stability and ease of use, or are expensive commercial products ill-suited to small and medium-scale cluster systems; for most cluster systems, therefore, NFS remains irreplaceable.
As the NFS framework has matured, NFS has spread into many application areas, such as high-performance computing, IPTV (Internet Protocol Television), sensor information processing, and petroleum exploration. However, with the rapid growth of processor speed, the I/O bandwidth that NFS provides has gradually become a bottleneck of cluster systems.
Research on the I/O performance of the NFS file system has made effective progress, particularly for applications with regular I/O access patterns, such as IPTV, whose access patterns tend to be sequential and predictable. By extracting the access patterns of such applications and applying techniques such as caching, prefetching, or I/O parallelization, the I/O performance of NFS can be improved effectively.
However, for applications with random access patterns, there is very little research on improving the I/O performance of the NFS server. In applications such as petroleum exploration, both the computation and the data volume are very large, and the data operations are mainly random reads; the random-read access sequence, however, can be obtained in advance.
Device Mapper is a mapping framework from logical devices to physical devices introduced in the Linux 2.6 kernel; it performs work such as I/O request redirection through modular target driver plug-ins. By means of the Device Mapper mechanism, a low-speed storage device such as a hard disk and high-speed storage devices such as a ramdisk or an SSD can be mapped into a single logical block device.
Summary of the invention
For applications with the above random access pattern, the present invention proposes a pseudo-random NFS application acceleration system.
A pseudo-random NFS application acceleration system comprises two kernel modules, pfnfsd and dmcache;
wherein pfnfsd is a rewrite of the kernel module nfsd, mainly comprising an NFS client request positioning module and a prefetch module, and is used to capture and locate NFS requests from clients and to perform prefetch operations;
and dmcache is a target driver implemented on top of the Linux kernel device mapper mechanism, mainly comprising modules for managing the high-speed storage device, mapping addresses, and reclaiming resources, and is responsible for storing the prefetched data on the high-performance storage device.
In a first preferred technical scheme of the present invention, the pfnfsd module builds an index table in memory for each client according to that client's index file and maintains a pointer to the index position corresponding to the client's last actual request; it then looks up the index table for each incoming NFS client read request to determine the position of the client's current request in the index table.
In a further preferred technical scheme, after determining the client's read position, the pfnfsd module can synchronize with the prefetch thread and can also determine which cache blocks of the dmcache module can be reclaimed.
In a second preferred technical scheme, the prefetch module of pfnfsd uses an in-kernel prefetch thread to prefetch data according to the index table.
In a third preferred technical scheme, the system further comprises a prefetch reorder window, and the prefetch module directly reads the sorted index table.
In a fourth preferred technical scheme, the dmcache module maps the disk and the high-speed storage device into one block device, wherein the high-speed storage device is used to store the prefetched data; its capacity is hidden from the outside, and it acts as a cache of the disk.
In a fifth preferred technical scheme, dmcache performs address mapping and I/O redirection between the storage devices and the logical device, and manages the cache device.
In a sixth preferred technical scheme, when the prefetched data and the data read by the NFS client are in the same order, data blocks are reclaimed with a recovery algorithm.
In a further preferred technical scheme, the recovery algorithm is: each time the NFS client reads through the data of one I/O reorder window, the data blocks of one window are reclaimed.
The beneficial effects of the present invention are as follows:
Transparency: because NPRP is a kernel module loaded on the NFS server side, no changes to client applications or client kernels are required, so the system is transparent to applications;
Small changes to the Linux kernel: NPRP inserts a single function into the NFS kernel module to capture the I/O requests of NFS clients; all other modules are standalone modules that can be dynamically loaded and unloaded, so the changes to the kernel are very small;
Additional high-speed storage devices can be attached: NPRP can use host memory (a ramdisk) as the storage medium for prefetched data, or it can attach additional high-speed storage devices such as an MCard8; adding extra high-speed storage is equivalent to expanding the memory of the I/O server, giving the I/O server stronger caching capability.
Description of drawings
Fig. 1 shows the architecture of the pseudo-random NFS application acceleration technique.
Fig. 2 shows how the NFS client's read position is located.
Fig. 3 shows the dmcache architecture.
Embodiment
The pseudo-random NFS application acceleration technique mainly consists of two kernel modules, pfnfsd (prefetch network file system daemon) and dmcache (device mapper cache). pfnfsd is a rewrite of the kernel module nfsd; it mainly comprises an NFS client request positioning module and a prefetch module, and is used to capture and locate NFS requests and to carry out the principal work such as prefetching. dmcache is a target driver implemented on top of the Linux kernel Device Mapper mechanism; it mainly comprises modules for managing the high-speed storage device, mapping addresses, and reclaiming resources, and it places the prefetched data on a high-performance storage device such as a ramdisk or a high-performance SSD.
In Fig. 1, the request handling process of nfs is as follows: according to the request information, the pfnfsd module locates the client's position, then prefetches, synchronizes, and reclaims resources according to that position; the dmcache module then performs the I/O redirection.
1. The pfnfsd module
pfnfsd first obtains the index file of each client, builds an index table in memory for each client, and maintains a pointer for each index table that points to the index position corresponding to the last actual request.
pfnfsd then continuously receives NFS client read requests; from each request and the index table it determines the position of the client's current request in the index table. A simplified flow chart is shown in Fig. 2.
As long as an NFS client read request intersects some entry of the index table, we can conclude that the client has issued that request of the index file. Suppose the entry that intersects the client's request is the Nth entry of the index table; because the client issues its read requests in the order of the index table, the client's current read position can be represented by the position of the current request in the index table, namely N.
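The patent itself contains no code; the positioning logic above can be sketched as a userspace Python simulation (the function names, the `(offset, length)` entry format, and the forward scan from the last matched position are illustrative assumptions, not taken from the patent):

```python
# Hypothetical sketch: locate an NFS client's current position N in its
# per-client index table, resuming the scan from the last matched entry.

def ranges_overlap(a_start, a_len, b_start, b_len):
    """True if the byte ranges (a_start, a_len) and (b_start, b_len) intersect."""
    return a_start < b_start + b_len and b_start < a_start + a_len

def locate_request(index_table, last_pos, req_offset, req_len):
    """Scan forward from last_pos (the pointer to the last actual request);
    return the index N of the first entry intersecting the client's read
    request, or last_pos unchanged if no entry matches."""
    for n in range(last_pos, len(index_table)):
        off, length = index_table[n]
        if ranges_overlap(off, length, req_offset, req_len):
            return n
    return last_pos
```

Because the client reads in index-table order, a single forward pointer per table suffices; each lookup resumes where the previous one stopped instead of rescanning the whole table.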
Determining the client's read position serves two purposes. On the one hand, it allows synchronization with the prefetch thread, preventing the prefetch thread from lagging behind the client application; on the other hand, it determines which cache blocks of the dmcache module can be reclaimed, which is described in detail in the cache block reclamation module.
pfnfsd also contains a prefetch module that uses a single kernel thread, called the prefetch thread, to prefetch data according to the index table. A single thread is used rather than several because our tests showed that, for the same file, single-threaded reads are often faster than several threads simultaneously reading different parts of the file.
To improve the sequentiality of disk access, I/O reordering is performed. I/O reordering first defines a sort window whose size is measured by the I/O capacity of the requests it contains; for example, if the requests in every 100 MB of the index table are sorted together, the sort window is 100 MB. The index table can be sorted window by window in advance, outside the prefetch module, in which case the prefetch module directly reads the sorted index table, or the sorting can be done inside the prefetch module before prefetching. While prefetching, the client's read request position must be observed: prefetching may proceed only while the prefetch position is ahead of the client's read request position; otherwise the client's read position has caught up with or overtaken the prefetch position, and the client must be made to wait.
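The reorder step above can be illustrated with a small userspace sketch (the list-of-`(offset, length)` representation and the function name are assumptions for illustration; the patent specifies only the windowed-sort behavior):

```python
# Illustrative sketch of the I/O reorder window: sort index-table entries
# by disk offset within fixed-capacity windows, leaving window boundaries
# intact so the window-based reclamation rule still applies.

def reorder_in_windows(index_table, window_bytes):
    """index_table: list of (disk_offset, length) in client read order.
    Returns a new table where each window of ~window_bytes of requested
    capacity is sorted by offset, improving disk access sequentiality."""
    out, window, acc = [], [], 0
    for off, length in index_table:
        window.append((off, length))
        acc += length
        if acc >= window_bytes:          # window capacity reached:
            out.extend(sorted(window))   # sort this window by offset
            window, acc = [], 0
    out.extend(sorted(window))           # flush the final partial window
    return out
```

Sorting only within a window trades some sequentiality for bounded reordering distance, which is what makes the later "reclaim one window at a time" rule safe.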
The prefetched data is not kept in memory by this module; instead, the dmcache module stores it on a cache device, which can be a ramdisk, an SSD, and so on.
2. The dmcache module
The dmcache module mainly caches the prefetched data. It is implemented as a target driver of Device Mapper.
Device Mapper is a mapping framework from logical devices to physical devices introduced in the Linux 2.6 kernel; it performs work such as I/O request redirection through modular target driver plug-ins. By means of the Device Mapper mechanism, a low-speed storage device such as a hard disk and high-speed storage devices such as a ramdisk or an SSD can be mapped into a single logical block device.
The flow of the dmcache module is shown in Fig. 3.
The dmcache module sits below the Linux file system layer and above the device driver layer, so the requests seen in Fig. 3 are bios.
In Fig. 3, the dmcache module maps the disk and the high-speed storage device into one block device; the high-speed storage device is used to store the prefetched data, its capacity is hidden from the outside, and it acts as a cache of the disk.
On the one hand, dmcache must map addresses of the exported logical device onto the disk or the high-speed storage device; this is the job of the address mapping and I/O redirection module. On the other hand, it must manage the high-speed storage device; this is the job of the cache block (region) management module.
Bio requests coming down from the file system layer fall into two classes:
One class is prefetch requests issued by the prefetch thread: if the requested data is not yet on the high-speed storage device, it is read from the hard disk and copied to the high-speed storage device; otherwise it is read directly from the high-speed storage device and returned.
The other class is requests coming from NFS clients: if the requested data is on the high-speed storage device, it is fetched from there; otherwise it is fetched directly from the hard disk.
Both kinds of bio must pass through the hash lookup and I/O redirection before they reach the real physical device.
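The two-class dispatch above can be sketched in userspace Python (the function name, the string-valued `kind` argument, and the injected `lookup`/`copy_to_cache` callbacks are hypothetical; the kernel code would operate on bio structures instead):

```python
# Illustrative sketch of how the two bio classes are dispatched.
# lookup(disk_addr) returns a cache address or None (not yet prefetched);
# copy_to_cache(disk_addr) models "read from disk and copy to fast device".

def handle_bio(kind, disk_addr, lookup, copy_to_cache):
    """kind: 'prefetch' (from the prefetch thread) or 'client' (from NFS).
    Returns which device the data is served from."""
    cached = lookup(disk_addr)
    if kind == "prefetch":
        if cached is None:
            copy_to_cache(disk_addr)   # miss: read disk, fill fast device
            return "disk"
        return "cache"                 # already prefetched: nothing to do
    # client request: serve from the fast device on hit, else from disk
    return "cache" if cached is not None else "disk"
```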
The high-speed storage device is divided into cache blocks called regions and is managed at region granularity; a region is the data structure responsible for mapping from the disk's address space into the address space of the high-speed cache device. Thus, to check whether the data a bio reads is on the high-speed storage device, it suffices to find the region corresponding to the bio's disk address.
To find the region corresponding to a disk address, a hash table implements this mapping. All regions that have been filled with disk data are kept in the hash table; regions not yet filled with data are kept in a linked list. In Fig. 3, therefore, deciding whether a bio's data has been prefetched is in fact a lookup of this hash table with the bio's corresponding disk address: if a region is found, the data has been prefetched, and the region yields the address of that data on the high-speed storage device, from which the data is read.
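A minimal userspace model of the region table follows (the class names, the 64 KiB region granularity, and the use of a Python dict in place of the kernel hash table are all assumptions for illustration):

```python
# Hypothetical sketch of dmcache's region mapping: a hash table maps a
# disk address, aligned to the region size, to a region descriptor that
# records where the data lives on the fast device.

REGION_SIZE = 64 * 1024  # assumed region granularity (not from the patent)

class Region:
    def __init__(self, disk_addr, cache_addr):
        self.disk_addr = disk_addr    # region start address on the disk
        self.cache_addr = cache_addr  # region start address on the fast device

class RegionTable:
    def __init__(self):
        self.table = {}   # disk region index -> Region (models the hash table)
        self.free = []    # regions not yet filled (the patent uses a list)

    def insert(self, region):
        self.table[region.disk_addr // REGION_SIZE] = region

    def lookup(self, disk_addr):
        """Return the Region holding disk_addr, or None if not prefetched."""
        return self.table.get(disk_addr // REGION_SIZE)

    def redirect(self, disk_addr):
        """I/O redirection: translate a disk address to a fast-device
        address when cached, else leave the request aimed at the disk."""
        r = self.lookup(disk_addr)
        if r is None:
            return ("disk", disk_addr)
        return ("cache", r.cache_addr + (disk_addr - r.disk_addr))
```

The `redirect` method corresponds to changing the bio's target device and in-device offset described next.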
The target device and the in-device offset of the bio are then changed using the region that was found; this is, in effect, the I/O redirection.
Just as the operating system's pages must be reclaimed, regions must be replaced and reclaimed once the free regions are exhausted.
Because the high-speed storage device holds prefetched data, if the prefetched data and the data read by the NFS client were in the same order, a region could be reclaimed as soon as the NFS client hit it. But the prefetch thread has reordered the I/O of the index file, so the order in which data is prefetched differs from the order in which the NFS client actually reads it, and the simple "reclaim on hit" method cannot be used. Traditional algorithms such as LRU are also unsuitable, because the NFS client's reads are completely random and discrete: the data being hit now is usually data that will never be read again.
We therefore adopt a region reclamation algorithm suited to these characteristics: each time the NFS client reads through the data of one I/O reorder window, the regions of one window are reclaimed. This algorithm is effective because the NFS client's request sequence and the prefetch thread's request sequence always have a "crossing point" at the beginning and end of each I/O reorder window: once the NFS client's reads have passed a window, all regions of the previous window can certainly be reclaimed.
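The window-based reclamation rule can be sketched as follows (the list-of-window representation, the `reclaimed` bookkeeping set, and the function name are illustrative assumptions; only the "free a whole window once the client has read past it" rule comes from the patent):

```python
# Sketch of window-based region reclamation: once the NFS client has read
# past an entire I/O reorder window, every region of that window is freed.

def reclaim_windows(windows, client_window, reclaimed):
    """windows: list of lists of region ids, one inner list per reorder
    window, in client read order.  client_window: index of the window the
    client is currently reading.  reclaimed: set of window indices already
    freed (mutated in place).  Returns the region ids newly freed."""
    freed = []
    for i in range(client_window):       # every window before the current one
        if i not in reclaimed:
            freed.extend(windows[i])
            reclaimed.add(i)
    return freed
```

Unlike LRU, this never evicts regions the client still needs: within the current window the prefetch order and the read order may differ, but everything before the window boundary is provably done.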

Claims (9)

1. A pseudo-random NFS application acceleration system, characterized in that it comprises two kernel modules, pfnfsd and dmcache;
wherein pfnfsd is a rewrite of the kernel module nfsd, mainly comprising an NFS client request positioning module and a prefetch module, and is used to capture and locate NFS requests from clients and to perform prefetch operations;
and dmcache is a target driver implemented on top of the Linux kernel device mapper mechanism, mainly comprising modules for managing the high-speed storage device, mapping addresses, and reclaiming resources, and is responsible for storing the prefetched data on the high-performance storage device.
2. The pseudo-random NFS application acceleration system according to claim 1, characterized in that the pfnfsd module builds an index table in memory for each client according to that client's index file, maintains a pointer to the index position corresponding to the client's last actual request, and then looks up the index table for each incoming NFS client read request to determine the position of the client's current request in the index table.
3. The pseudo-random NFS application acceleration system according to claim 2, characterized in that, after determining the client's read position, the pfnfsd module can synchronize with the prefetch thread and can also determine which cache blocks of the dmcache module can be reclaimed.
4. The pseudo-random NFS application acceleration system according to claim 1, characterized in that the prefetch module of pfnfsd uses an in-kernel prefetch thread to prefetch data according to the index table.
5. The pseudo-random NFS application acceleration system according to claim 1, characterized in that the system further comprises a prefetch reorder window, and the prefetch module directly reads the sorted index table.
6. The pseudo-random NFS application acceleration system according to claim 1, characterized in that the dmcache module maps the disk and the high-speed storage device into one block device, wherein the high-speed storage device is used to store the prefetched data, its capacity is hidden from the outside, and it acts as a cache of the disk.
7. The pseudo-random NFS application acceleration system according to claim 1, characterized in that dmcache performs address mapping and I/O redirection between the storage devices and the logical device, and manages the cache device.
8. The pseudo-random NFS application acceleration system according to claim 1, characterized in that, when the prefetched data and the data read by the NFS client are in the same order, data blocks are reclaimed with a recovery algorithm.
9. The pseudo-random NFS application acceleration system according to claim 8, characterized in that the recovery algorithm is: each time the NFS client reads through the data of one I/O reorder window, the data blocks of one window are reclaimed.
CN 201010611721 2010-12-17 2010-12-17 Pseudo-random type NFS application acceleration system Active CN102147802B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010611721 CN102147802B (en) 2010-12-17 2010-12-17 Pseudo-random type NFS application acceleration system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010611721 CN102147802B (en) 2010-12-17 2010-12-17 Pseudo-random type NFS application acceleration system

Publications (2)

Publication Number Publication Date
CN102147802A (en) 2011-08-10
CN102147802B (en) 2013-02-20

Family

ID=44422068

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010611721 Active CN102147802B (en) 2010-12-17 2010-12-17 Pseudo-random type NFS application acceleration system

Country Status (1)

Country Link
CN (1) CN102147802B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317551A (en) * 2014-10-17 2015-01-28 北京德加才科技有限公司 Ultrahigh-safety true random number generation method and ultrahigh-safety true random number generation system
CN104793892A (en) * 2014-01-20 2015-07-22 上海优刻得信息科技有限公司 Method for accelerating random disk input/output (IO) reads and writes
CN106407133A (en) * 2015-07-30 2017-02-15 爱思开海力士有限公司 Memory system and operating method thereof
CN106528001A (en) * 2016-12-05 2017-03-22 北京航空航天大学 Cache system based on nonvolatile memory and software RAID
CN108520050A (en) * 2018-03-30 2018-09-11 北京邮电大学 Merkle tree cache device based on two-dimensional localization and its method of operating on Merkle trees

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211358A (en) * 2006-12-31 2008-07-02 联想(北京)有限公司 Method for accessing dedicated file systems
CN101477486A (en) * 2009-01-22 2009-07-08 中国人民解放军国防科学技术大学 File backup recovery method based on sector recombination

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211358A (en) * 2006-12-31 2008-07-02 联想(北京)有限公司 Method for accessing dedicated file systems
CN101477486A (en) * 2009-01-22 2009-07-08 中国人民解放军国防科学技术大学 File backup recovery method based on sector recombination

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Bin Chen et al., "DPM: A Demand-driven Virtual Disk Prefetch Mechanism for Mobile Personal Computing Environments", 2009 Sixth IFIP International Conference on Network and Parallel Computing, 2009, pp. 59-66. *
Surendra Byna et al., "Parallel I/O Prefetching Using MPI File Caching and I/O Signatures", International Conference for High Performance Computing, Networking, Storage and Analysis, 2008, pp. 1-12. *
翟佳 et al., "面向IO服务器的高性能存储器的实现与优化" ("Implementation and Optimization of High-Performance Storage for I/O Servers"), 计算机工程与科学 (Computer Engineering and Science), vol. 31, no. 11, 2009, pp. 17-20. *
朱正义, "高性能块级CDP系统的研究与设计" ("Research and Design of a High-Performance Block-Level CDP System"), 计算机工程与设计 (Computer Engineering and Design), vol. 31, no. 24, 2010, pp. 5224-5226, 5261. *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104793892A (en) * 2014-01-20 2015-07-22 上海优刻得信息科技有限公司 Method for accelerating random in-out (IO) read-write of disk
CN104793892B (en) * 2014-01-20 2019-04-19 优刻得科技股份有限公司 Method for accelerating random disk input/output (IO) reads and writes
CN104317551A (en) * 2014-10-17 2015-01-28 北京德加才科技有限公司 Ultrahigh-safety true random number generation method and ultrahigh-safety true random number generation system
CN106407133A (en) * 2015-07-30 2017-02-15 爱思开海力士有限公司 Memory system and operating method thereof
CN106407133B (en) * 2015-07-30 2020-10-27 爱思开海力士有限公司 Storage system and operation method thereof
CN106528001A (en) * 2016-12-05 2017-03-22 北京航空航天大学 Cache system based on nonvolatile memory and software RAID
CN106528001B (en) * 2016-12-05 2019-08-23 北京航空航天大学 A kind of caching system based on nonvolatile memory and software RAID
CN108520050A (en) * 2018-03-30 2018-09-11 北京邮电大学 Merkle tree cache device based on two-dimensional localization and its method of operating on Merkle trees

Also Published As

Publication number Publication date
CN102147802B (en) 2013-02-20

Similar Documents

Publication Publication Date Title
US10176057B2 (en) Multi-lock caches
Lu et al. Wisckey: Separating keys from values in ssd-conscious storage
CN112214424B (en) Object memory architecture, processing node, memory object storage and management method
Liao et al. Multi-dimensional index on hadoop distributed file system
Debnath et al. Revisiting hash table design for phase change memory
US6754800B2 (en) Methods and apparatus for implementing host-based object storage schemes
Lu et al. BloomStore: Bloom-filter based memory-efficient key-value store for indexing of data deduplication on flash
CN105302744B (en) The invalid data area of Cache
US9792221B2 (en) System and method for improving performance of read/write operations from a persistent memory device
US20150095346A1 (en) Extent hashing technique for distributed storage architecture
EP2885728A2 (en) Hardware implementation of the aggregation/group by operation: hash-table method
CN102147802B (en) Pseudo-random type NFS application acceleration system
US9940060B1 (en) Memory use and eviction in a deduplication storage system
US9229869B1 (en) Multi-lock caches
US9612975B2 (en) Page cache device and method for efficient mapping
CN106133703A (en) RDMA is used to scan internal memory for deleting repetition
CN109582221A (en) Host computing device, remote-server device, storage system and its method
Hoque et al. Disk layout techniques for online social network data
US20230350810A1 (en) In-memory hash entries and hashes used to improve key search operations for keys of a key value store
Tulkinbekov et al. CaseDB: Lightweight key-value store for edge computing environment
Zhou et al. Hierarchical consistent hashing for heterogeneous object-based storage
US10339052B2 (en) Massive access request for out-of-core textures by a parallel processor with limited memory
US20230350610A1 (en) Prefetching keys for garbage collection
CN1212570C (en) Two-stage CD mirror server/client cache system
US11899642B2 (en) System and method using hash table with a set of frequently-accessed buckets and a set of less frequently-accessed buckets

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: DAWNING INFORMATION INDUSTRY Co.,Ltd.

Address before: 100084 Beijing Haidian District City Mill Street No. 64

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.