CN103902660B - System and method for prefetching file layout through readdir++ in cluster file system - Google Patents
- Publication number
- CN103902660B (CN201410076739.0A)
- Authority
- CN
- China
- Prior art keywords
- file
- client
- directory
- catalogue
- submodule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
- G06F16/1824—Distributed file systems implemented using Network-attached Storage [NAS] architecture
- G06F16/183—Provision of network file services by network file servers, e.g. by using NFS, CIFS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
- H04L67/5681—Pre-fetching or pre-delivering data based on network characteristics
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention discloses a system and method for prefetching file layouts through readdir++ in a cluster file system. The system comprises a client module (1) and a server module (2). The client module (1) obtains a read-directory delegation from the server module (2) or returns it; after obtaining the delegation, it sends a read-directory request to the server module (2), stores the pages containing file layouts that the server module (2) sends back in a local cache, and, when it later reads a file in the directory, uses the file layout stored in the local cache directly. The server module (2) grants the read-directory delegation to the client module (1) or recalls it; on receiving a read-directory request, it encapsulates metadata information, including the file layout, in pages and sends the pages to the client module (1). This reduces the network interaction overhead of obtaining file layouts when reading files and can greatly improve read performance for massive numbers of small files.
Description
Technical field
The present invention relates to metadata prefetching mechanisms in cluster file systems, and more particularly to a system and method for prefetching file layouts through readdir++ in a cluster file system.
Background
With the arrival of the big-data era, the global volume of data is growing rapidly. In fields such as e-commerce, social networking, and scientific computing, there are more and more small files. Managing "massive small files" efficiently and providing low-latency small-file access has therefore become a new problem facing distributed file systems.
In recent years, separating metadata services from data services has become the main trend in distributed file systems. The advantage of this architecture is that clients access the storage devices directly, out of band, which yields higher performance for large-file access.
Small-file access, however, is an entirely different matter. For small files, data access accounts for a small share of the work and metadata access for a large share. Before a client can access a file's data, it must first synchronously obtain the file layout (layout) through a network interaction (RPC), which makes the latency of each individual small-file operation excessive. In particular, when a client reads a large number of small files in the same directory one after another, it must frequently perform a separate synchronous layout fetch for every small file, which severely degrades system performance.
Read directory (readdir) is the file system operation that reads a directory; its purpose is to obtain basic information such as the name (name), type (type), and inode number (ino) of every directory entry (entry) in the directory.
A directory delegation (DELEGATION) is a recallable guarantee that the server gives to a client. From the time the delegation is granted until it is recalled, the server guarantees that operations on the directory by other clients will not cause conflicts with the file system's consistency semantics. In essence, the server gives the client the ability to handle readdir, lookup, open, close, read, and write locally, without interacting with the server. Without directory delegations, every one of these operations must interact with the server to complete, at a very large cost in time.
The current parallel network file system (pNFS) uses the readdir+ technique, an improvement built on readdir and directory delegations. In addition to the name, type, and inode number, it also prefetches the file handle (fh) and file attributes (fattr) of every directory entry in the directory, two critical pieces of metadata. Afterwards, if the client needs the metadata of a file in that directory, it no longer has to send get-attribute or get-handle requests over the network; it obtains the metadata directly from its local cache, reducing the network interactions with the metadata server.
However, in a distributed file system based on a block interface, the metadata needed by the client's subsequent read operations is not only the file attributes and file handle but also the physical block numbers on the volume that correspond to the file's logical positions, that is, the file layout. The readdir+ technique used by pNFS therefore cannot reduce the network interaction overhead of obtaining file layouts.
Summary of the invention
To solve the above problems, an object of the present invention is to provide a system and method for prefetching file layouts through readdir++ in a cluster file system, which reduce the network interaction overhead of obtaining file layouts when reading files and can significantly improve read performance for massive numbers of small files.
The readdir++ technique of the present invention is an improvement on readdir+: when a directory is read, it prefetches not only file attributes, file handles, and so on, but also file layouts, which avoids the network interaction otherwise needed to obtain each file layout.
To achieve the above object, the present invention proposes a system for prefetching file layouts through readdir++ in a cluster file system, which prefetches the metadata information of massive numbers of small files, including their file layouts, so that the small files can be read quickly, characterized in that the system comprises:
a client module (1), configured to obtain a read-directory delegation from the server module (2) or return it; to send a read-directory request to the server module (2) after obtaining the delegation; and to store the pages containing file layouts sent by the server module (2) in a local cache, so that when the client module (1) reads a file in the directory it directly uses the layout of that file stored in the local cache;
a server module (2), which stores the metadata information of the small files and is configured to grant the read-directory delegation to the client module (1) or recall it; and, on receiving the read-directory request, to encapsulate the metadata information, including the file layout, in directory entries, encapsulate the directory entries in order in pages, and send the pages to the client module (1).
In the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, the client module (1) specifically comprises:
a client network-send submodule (11), for sending network interaction messages to the server module;
a client network-receive submodule (12), for receiving the network interaction messages sent by the server module;
a client page cache submodule (13), for storing the pages sent by the server module;
a client dentry and inode cache submodule (14), for storing directory cache entries (dentries) and inodes;
a client directory-entry parsing submodule (15), for traversing all pages in the client page cache submodule (13) and parsing out the metadata information;
a client readdir-submission submodule (16), for submitting the directory information that was read;
a client operation-trigger submodule (17), for triggering readdir, lookup, open, and close operations;
a client directory-delegation handling submodule (18), for checking whether a directory delegation is held and for removing a delegation that is to be recalled from the list of granted directories;
a client file-layout management module (19), for incrementing and decrementing file layout reference counts, merging a file layout with the layout already present in the corresponding inode, and reading the corresponding data from the data disk according to the file layout.
In the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, the server module (2) specifically comprises:
a server network-receive submodule (21), for receiving the network interaction requests sent by the client module;
a server layout-retrieval submodule (22), for obtaining file layouts and encoding them;
a server network-send submodule (23), for responding to the client module according to the type of network interaction message received;
a server directory-delegation submodule (24), for granting and recalling directory delegations.
In the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, the client page cache submodule (13) contains a plurality of pages, organized as follows:
directory entries are stored in order within each page, and pages are indexed by page index number; the number of directory entries in each page is not fixed; if the space remaining in the current page is not enough for the next directory entry, a new page is used, until every directory entry in the directory has been stored in a page; at the end of each directory entry, an end marker records whether the entry is the last one in its page and whether it is the last one in the directory.
In the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, the directory entry comprises an inode number, a name, a file handle, file attributes, a file layout, and an end marker; the inode number is used when submitting the directory information that was read; the name is used to build the dentry; and the file handle and file attributes are used to build the inode.
A method for use with the system for prefetching file layouts through readdir++ in a cluster file system as described above, characterized in that the method comprises the following steps:
Step 1, the client module obtains a read-directory delegation for a file directory from the server module;
Step 2, the client module triggers a readdir operation and obtains from the server module the information of all directory entries in the directory, including inode number, name, file handle, file attributes, and file layout;
Step 3, the client module triggers a file open operation and determines whether the file's dentry is present locally in the client dentry and inode cache submodule (14); if it is present, proceed to step 4; if not, proceed to step 5;
Step 4, the client module triggers a lookup of the file's dentry, obtains the dentry and its file layout during the lookup, increments the layout's reference count by 1, and then reads the data from the position on the data disk given by the layout obtained; proceed to step 6;
Step 5, the client module parses the corresponding directory entry in the page cache, builds the dentry from the metadata information in it, increments the file layout's reference count by 1, and then reads the data from the position on the data disk given by the layout obtained;
Step 6, the client module triggers a file close operation and decrements the layout's reference count by 1;
Step 7, the client module checks whether the layout reference counts of all files in the directory are 0; if not all are 0, some file has not yet been closed and the client must wait for it to be closed; if all are 0, the read-directory delegation can be returned to the server module.
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 1 further comprises the following steps:
Step 11, the client network-send submodule (11) sends a network interaction message to the server module, requesting a read-directory delegation for the directory;
Step 12, the server network-receive submodule (21) receives the message and determines that it is a directory delegation request;
Step 13, the server directory-delegation submodule (24) processes the request and decides whether to grant the delegation;
Step 14, the server network-send submodule (23) sends the result to the client;
Step 15, the client network-receive submodule (12) receives the server module's result indicating whether the read-directory delegation for the directory was granted; if it was granted, the client directory-delegation handling submodule (18) is notified and the method proceeds to step 16; if it was not granted, the method returns to step 11;
Step 16, the client directory-delegation handling submodule (18) records the directory in the list of read-directory delegations it holds.
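The request-and-retry loop of steps 11 to 16 can be sketched as follows. This is only an illustration under stated assumptions: `FakeServer`, `request_read_delegation`, and the `max_attempts` bound are hypothetical names introduced here (the text loops back to step 11 without stating a bound), while the integer reply of 1 or 0 follows the description of the client receive submodule.

```python
class FakeServer:
    """Hypothetical stand-in for the server module (not part of the patent)."""

    def __init__(self, grant_on_attempt):
        self.grant_on_attempt = grant_on_attempt
        self.attempts = 0

    def request_read_delegation(self, directory):
        # Steps 12-14: receive the request, decide, and reply 1 (granted) or 0.
        self.attempts += 1
        return 1 if self.attempts >= self.grant_on_attempt else 0


def acquire_read_delegation(server, directory, held_read_delegations, max_attempts=5):
    """Client side of steps 11, 15 and 16; max_attempts is an assumed bound."""
    for _ in range(max_attempts):
        if server.request_read_delegation(directory) == 1:   # steps 11 and 15
            held_read_delegations.append(directory)          # step 16: record it
            return True
    return False
```

A caller would pass the client's own list of held read delegations and proceed to the readdir of step 2 only when the function returns `True`.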
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 2 further comprises the following steps:
Step 21, the client operation-trigger submodule (17) triggers a readdir operation, and the client page cache submodule (13) looks in the local page cache for the pages holding the directory's entries; if they are not found, proceed to step 22; if they are found, proceed to step 27;
Step 22, the client network-send submodule (11) sends a network interaction message to the server module, requesting the directory information;
Step 23, the server network-receive submodule (21) receives the message;
Step 24, the server layout-retrieval submodule (22) obtains the file layout of each specified directory entry;
Step 25, the server network-send submodule (23) encapsulates the metadata information, including the file layout, in directory entries, encapsulates the directory entries in order in pages, and sends the pages to the client module;
Step 26, the client network-receive submodule (12) receives the pages sent by the server and hands them to the client page cache submodule (13) to store;
Step 27, the client directory-entry parsing submodule (15) parses the directory entries one by one from the pages in the local cache; the name is used to build the dentry, and the file handle and file attributes are used to build the inode; the newly built dentry and inode are handed to the client dentry and inode cache submodule (14) to store;
Step 28, the client file-layout management module (19) merges the file layout with the file layout already stored in the inode;
Step 29, the client readdir-submission submodule (16) submits the name, inode number, and type to the readdir operation, and sets to 1 the flag in the parent directory's inode that records that a readdir operation has been performed.
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 3 further comprises the following steps:
Step 31, the client operation-trigger submodule (17) triggers a file open operation;
Step 32, the client directory-delegation handling submodule (18) checks whether a read-directory delegation is held; if so, proceed to step 33; if not, proceed to step 34;
Step 33, the client readdir-submission submodule (16) checks whether the flag is 1; if it is 1, a readdir operation has been performed, and the method proceeds to step 35; if it is 0, no readdir operation has been performed, and the method proceeds to step 34;
Step 34, the client network-send submodule (11) sends a network interaction message, and the open operation is completed by the procedure used when no directory delegation is held;
Step 35, the client dentry and inode cache submodule (14) checks whether the dentry of the file to be opened exists in the cache; if it exists, proceed to step 4; if it does not, proceed to step 5.
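The branching in steps 32 to 35 amounts to a small decision function. The sketch below is illustrative only; the function name and the returned labels are hypothetical, and the three boolean inputs stand for the delegation check, the readdir-done flag, and the dentry cache check described in those steps.

```python
def choose_open_path(has_read_delegation, readdir_done, dentry_cached):
    """Return which path an open takes, following steps 32-35."""
    if not has_read_delegation:       # step 32: no delegation held
        return "rpc_open"             # step 34: ordinary open via the server
    if not readdir_done:              # step 33: flag is 0
        return "rpc_open"             # step 34
    if dentry_cached:                 # step 35: dentry present in the cache
        return "lookup_cached_dentry"     # continue with step 4
    return "rebuild_from_page_cache"      # continue with step 5
```

Only the first two outcomes involve the server; once the delegation is held and the directory has been read, both remaining paths are served from local caches.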
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 4 further comprises the following steps:
Step 41, the client operation-trigger submodule (17) triggers a lookup of the file's dentry;
Step 42, the client dentry and inode cache submodule (14) returns the cached dentry directly to the open operation;
Step 43, the client file-layout management module (19) increments the file layout's reference count by 1, and then reads the data from the position on the data disk given by the layout obtained.
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 5 further comprises the following steps:
Step 51, the client directory-entry parsing submodule (15) traverses the page cache looking for the corresponding directory entry, the matching condition being that the name in the directory entry is identical to the name of the file to be opened; if no entry is found, proceed to step 52; if one is found, proceed to step 53;
Step 52, the client operation-trigger submodule (17) assigns an error message to the directory entry, and the method proceeds to step 56;
Step 53, the client directory-entry parsing submodule (15) parses out all the remaining information in the corresponding directory entry, including the file handle, file attributes, and file layout; the file handle and file attributes are used to build the inode, and the inode is associated with the dentry;
Step 54, the client dentry and inode cache submodule (14) stores the dentry and inode;
Step 55, the dentry is returned to the open operation;
Step 56, the client file-layout management module (19) merges the file layout with the layout already stored in the inode, increments the layout's reference count by 1, and then reads the data from the position on the data disk given by the layout obtained.
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 6 further comprises the following steps:
Step 61, the client operation-trigger submodule (17) triggers a file close operation;
Step 62, the client file-layout management module (19) decrements the layout's reference count by 1.
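The reference counting performed on open (steps 4 and 5) and on close (step 6) can be sketched as below. The class and method names are hypothetical, and keeping one counter per file name is an assumption made for brevity; the text only says that each file layout carries a reference count.

```python
class LayoutRefs:
    """Per-file layout reference counts kept by the client (illustrative)."""

    def __init__(self):
        self.counts = {}

    def on_open(self, name):
        # Steps 4 and 5: an open takes a reference on the file's layout.
        self.counts[name] = self.counts.get(name, 0) + 1

    def on_close(self, name):
        # Step 62: a close drops the reference taken at open time.
        if self.counts.get(name, 0) == 0:
            raise ValueError(f"close without matching open: {name}")
        self.counts[name] -= 1

    def all_closed(self):
        # The condition checked before the delegation is returned.
        return all(count == 0 for count in self.counts.values())
```

The `all_closed` check corresponds to the test the client applies before it gives the read-directory delegation back to the server.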
In the method for use with the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention, step 7 further comprises the following steps:
Step 71, the server directory-delegation submodule (24) decides to recall the directory delegation;
Step 72, the server network-send submodule (23) sends a network interaction message to the client module, notifying it to release the directory delegation;
Step 73, the client network-receive submodule (12) receives the recall notice;
Step 74, the client file-layout management module (19) checks whether the layout reference counts of all files in the directory are 0; if all are 0, proceed to step 75; otherwise, wait;
Step 75, the client directory-delegation handling submodule (18) removes the directory being recalled from the list of granted directories;
Step 76, the client page cache submodule (13) and the client dentry and inode cache submodule (14) clear the local caches, which also clears the file layouts;
Step 77, the client network-send submodule (11) sends a network interaction message notifying the server that the read-directory delegation is being returned.
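The client side of the recall (steps 74 to 77) can be sketched as a single function that either completes the return or reports that the client must keep waiting. All names here are hypothetical, the caches are reduced to plain Python containers, and the final notification to the server (step 77) is represented only by the `True` return value.

```python
class ClientDirState:
    """Illustrative client-side state for one delegated directory."""

    def __init__(self, path, layout_refcounts):
        self.path = path
        self.layout_refcounts = dict(layout_refcounts)  # file name -> count
        self.page_cache = ["page0"]                     # placeholder pages
        self.dentry_cache = {name: object() for name in layout_refcounts}


def try_return_delegation(state, granted_list):
    """Steps 74-77: return True if the delegation was given back."""
    # Step 74: every layout reference count in the directory must be 0.
    if any(count != 0 for count in state.layout_refcounts.values()):
        return False                    # a file is still open: wait
    granted_list.remove(state.path)     # step 75: drop from the granted list
    state.page_cache.clear()            # step 76: clear the page cache
    state.dentry_cache.clear()          # step 76: clear dentries and inodes
    state.layout_refcounts.clear()      # layouts go away with the caches
    return True                         # step 77 would now notify the server
```

A caller would invoke this again after each close until it returns `True`.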
The beneficial effects of the present invention are:
With the system and method for prefetching file layouts through readdir++ in a cluster file system proposed by the present invention, once the client module has completed a readdir operation, the network interaction overhead of obtaining file layouts in subsequent file read operations is reduced; at the same time, other operations (including open, close, lookup, and repeated readdir) need no metadata access to the server and can be completed locally on the client, so the metadata network interaction overhead is saved entirely. Comparative performance tests show that, in a massive-small-file environment, this method trades a very small amount of client cache space for a substantial improvement in read performance.
Brief description of the drawings
Fig. 1 is a structural diagram of the system for prefetching file layouts through readdir++ according to the present invention;
Fig. 2 is a structural diagram of the organization of the page cache according to the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the system and method for prefetching file layouts through readdir++ in a cluster file system are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
First, the system for prefetching file layouts through readdir++ in a cluster file system according to the present invention is described. The system structure is shown in Fig. 1.
The system comprises a client module 1 and a server module 2, where the server module 2 stores the files' metadata information. The client module further comprises the following nine submodules:
Client network-send submodule 11:
Used by the client module to send network interaction messages (RPC) to the server module. The scenarios involved in the present invention are: requesting a read-directory delegation, requesting directory information, open operations, and returning a read-directory delegation.
Client network-receive submodule 12:
Used by the client module to receive the network interaction messages sent by the server module. The scenarios involved in the present invention are: requesting a read-directory delegation and requesting directory information. The reply to a delegation request is an integer, 1 or 0, indicating whether the delegation was granted; the reply to a directory-information request consists of pages, which the client module stores directly in its local page cache.
Client page cache submodule 13:
The client's local cache consists of two parts: one is the page cache of the client page cache submodule 13, which stores pages; the other is the dentry and inode cache of the client dentry and inode cache submodule 14, which stores only dentries and inodes.
The organization of the page cache is shown in Fig. 2:
Directory entries are stored in order within each page, and pages are indexed by page index number. Because names differ in length and file attributes differ in which attribute types they contain, directory entries have no fixed size, so the number of directory entries each page can hold is not fixed either. If the space remaining in the current page is not enough for the next directory entry, a new page is used, and so on until every directory entry in the directory has been stored in a page. At the end of each directory entry, an end marker (eof) records whether the entry is the last one in its page and whether it is the last one in the directory.
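The packing rule just described can be illustrated with a short sketch. The 4096-byte page size, the function name, and the representation of the end marker as two booleans are assumptions made for illustration; the text does not specify a page size or an on-page encoding.

```python
PAGE_SIZE = 4096  # assumed page size; not specified in the text


def pack_entries(entry_sizes, page_size=PAGE_SIZE):
    """Pack variable-size directory entries into pages, in order.

    Returns a list of pages; each page is a list of tuples
    (entry_index, last_in_page, last_in_directory), where the two
    booleans play the role of the end marker (eof).
    """
    pages = [[]]
    free = page_size
    for index, size in enumerate(entry_sizes):
        if size > page_size:
            raise ValueError("directory entry larger than a page")
        if size > free:          # not enough room left: start a new page
            pages.append([])
            free = page_size
        pages[-1].append(index)
        free -= size
    last = len(entry_sizes) - 1
    return [
        [(i, i == page[-1], i == last) for i in page]
        for page in pages
        if page
    ]
```

For example, with a 100-byte page and entries of 40, 50, and 30 bytes, the first two entries share page 0 and the third starts page 1, where it is marked as both the last entry of its page and the last entry of the directory.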
It should be noted in particular that as long as the client holds the directory delegation, the pages obtained by the readdir operation are guaranteed to be valid. Once the directory delegation is recalled, the information in the page cache is no longer valid, so the page cache must be cleared.
Client dentry and inode cache submodule 14:
When a directory is read, the prefetched name (name) is used to build the directory cache entry (dentry), the file handle (fh) and file attributes (fattr) are used to build the inode (inode), and the file layout (layout) is merged with the layout already present in the inode. After the dentry and the inode have been associated, both can be stored in the cache.
Note that holding a directory delegation does not mean the dentry and inode caches remain valid forever, because the client periodically evicts dentries and inodes that have not been used for a long time. When the dentry is valid, a lookup (lookup) operation returns the corresponding dentry directly; when the dentry is invalid, the dentry and inode can be rebuilt from the information in the still-valid page cache. Neither case requires a network interaction with the metadata server.
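The two lookup cases can be sketched as follows. This is a simplified illustration: dentries, inodes, and directory entries are modelled as plain dictionaries, and the field keys (`name`, `ino`, `fh`, `fattr`, `layout`) follow the abbreviations used in the text, while the function itself and the dictionary shapes are hypothetical.

```python
def lookup(name, dentry_cache, page_cache):
    """Return the dentry for `name` without contacting the server.

    dentry_cache: dict mapping file name -> dentry (may have been evicted).
    page_cache:   list of pages, each a list of directory-entry dicts.
    """
    dentry = dentry_cache.get(name)
    if dentry is not None:
        return dentry                      # case 1: the dentry is still valid

    # Case 2: the dentry was evicted; rebuild it from the valid page cache.
    for page in page_cache:
        for entry in page:
            if entry["name"] == name:      # names must match exactly
                inode = {
                    "ino": entry["ino"],
                    "fh": entry["fh"],         # file handle builds the inode
                    "fattr": entry["fattr"],   # file attributes build the inode
                    "layout": entry["layout"],
                }
                dentry = {"name": name, "inode": inode}
                dentry_cache[name] = dentry    # store for later lookups
                return dentry
    return None                            # no such entry in this directory
```

In both branches the function touches only client-side structures, which is the point the paragraph above makes.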
Client directory-entry parsing submodule 15:
This module traverses all the pages in the page cache and, within each page, traverses the directory entries one by one and parses out their full contents: inode number, name, file handle, file attributes, file layout, and so on. The submodule is used in two places: readdir (readdir) and lookup (lookup).
Client readdir-submission submodule 16:
This module is the core of the original readdir+ flow: the client returns the basic information of every directory entry in the directory, such as name, type, and inode number, which fulfils the purpose of reading the directory. The client maintains a flag, plusplus_done, in the parent directory's inode to indicate whether a readdir operation has been performed.
Client operation-trigger submodule 17:
In the present invention, the client operations involved are readdir (readdir), lookup (lookup), open (open), and close (close).
Client directory-delegation handling submodule 18:
An important foundation of the present invention is that the client obtains a read-directory delegation; only then is the information in the local cache guaranteed to be valid, so that operations can be carried out entirely locally and the RPC interactions saved. The client maintains two local lists of directories for which it holds delegations: a read-delegation list and a read-write-delegation list. When a delegation is obtained, the directory is appended to the tail of the corresponding list; when a delegation is recalled, the directory is removed from the corresponding list.
Client file-layout management module 19:
This module is mainly responsible for incrementing and decrementing file layout reference counts, merging a file layout with the layout already present in the inode, and reading the corresponding data from the data disk according to the file layout. A file layout consists mainly of the following parts: start, length, type, mode, and number.
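A layout record with the five listed parts, and one possible merge rule, can be sketched as below. The text does not define how the merge works, so the rule used here (a prefetched segment replaces an existing segment with the same start, and the result is kept sorted by start) is purely an assumption, as are the class and function names and the example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutSegment:
    start: int      # "start": logical offset in the file
    length: int     # "length"
    seg_type: str   # "type"
    mode: str       # "mode"
    number: int     # "number"; its exact meaning is not given in the text


def merge_layouts(existing, prefetched):
    """Merge prefetched segments into the segments already in the inode.

    Assumed rule: segments are keyed by start offset, and a prefetched
    segment replaces an existing one with the same start.
    """
    by_start = {seg.start: seg for seg in existing}
    for seg in prefetched:
        by_start[seg.start] = seg
    return [by_start[start] for start in sorted(by_start)]
```

A real implementation would also have to handle partially overlapping ranges, which this sketch deliberately leaves out.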
The server module, in turn, comprises the following four submodules:
Server network-receive submodule 21:
The server receives the network interaction requests sent by the client and determines the request type.
Server layout-retrieval submodule 22:
This module obtains file layouts in the same way as file layouts are obtained under the readdir+ technology described in the Background section.
Server network-send submodule 23:
The server responds according to the type of network interaction message received. If the request is a readdir++, the server encapsulates the file layout it has obtained, together with the file handle, file attributes, name, and other information, in a directory entry; the directory entries are then encapsulated in order in pages and sent to the client in network interaction messages. If the request is for a directory delegation, the server returns whether the delegation was granted.
Server-side directory authorization submodule 24:
The server side grants directory authorizations to clients, which is the basis on which a client can keep the file layouts in its cache valid. This module mainly consists of two functions: granting directory authorizations and recalling directory authorizations. The server side maintains, for each directory, a linked list of granted directory authorizations, which records which clients have obtained authorization for that directory. Note that at any given moment at most one client holds the read-write directory authorization for a given directory, while several clients may hold the read directory authorization for a given directory.
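A minimal sketch of the server-side bookkeeping follows. It enforces only the rule stated above (at most one read-write holder per directory, any number of read holders); whether read and read-write holders may coexist is not stated here, so the sketch does not restrict it. All names are hypothetical.

```python
class ServerDirectoryAuthorizations:
    """Per-directory record of which clients hold which authorization."""

    def __init__(self):
        # directory -> list of (client, kind); a list stands in for the linked list
        self.granted = {}

    def grant(self, directory, client, kind):
        holders = self.granted.setdefault(directory, [])
        if kind == "read_write" and any(k == "read_write" for _, k in holders):
            return False  # another client already holds the read-write authorization
        holders.append((client, kind))
        return True

    def recall(self, directory, client):
        # Drop every authorization the client holds on this directory.
        holders = self.granted.get(directory, [])
        self.granted[directory] = [(c, k) for c, k in holders if c != client]

    def holders(self, directory):
        return list(self.granted.get(directory, []))
```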
Below the method for the system to prefetching file layout by readdir++ in the cluster file system of the present invention is carried out
Explanation.
The method is comprised the following steps:
Step 1, client modules obtain the reading catalogue mandate of file directory from server module;
Directory operation is read in step 2, client modules triggering, and whole directory entries in the catalogue are obtained from server module
Information, including inode number, title, file handle, file attribute, file layout;
Step 3, client modules triggering file open operation, and judge whether the Directory caching item of local this document is deposited
Be client directory cache item with the cache sub-module 14 of index node in;If there is then enter step 4, if there is no
Then enter step 5;
The operation of the Directory caching item of this document is searched in step 4, client modules triggering, is obtained during the search operation
Directory caching item and its file layout are taken, and the reference count to this document layout increases 1, then according to the file for having obtained
Layout, reads data content, into step 6 from data disk with the file layout correspondence position;
Step 5, client modules parse corresponding directory entry in page cache, and using metadata information structure therein
Directory caching item is built, and the reference count to file layout increases 1, then according to the file layout for having obtained, from data magnetic
Data content is read in disk with the file layout correspondence position;
Step 6, client modules triggering file close operation, the reference count to this document layout subtracts 1;
Step 7, client modules check whether the reference count of the file layout of the All Files in the catalogue is all 0,
If being not all 0, illustrate that file is not turned off, needed to wait for its closing;If 0, then reading can be given back to server module
Catalogue mandate.
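The reference counting in steps 4 to 7 can be illustrated with the sketch below. It models only the counters, not the RPCs or disk reads, and the class and method names are hypothetical.

```python
class CachedDirectory:
    """Reference counts of the file layouts of the files in one directory."""

    def __init__(self, file_names):
        self.ref_counts = {name: 0 for name in file_names}

    def open_file(self, name):
        # Steps 4 and 5: opening a file increments its layout reference count.
        self.ref_counts[name] += 1

    def close_file(self, name):
        # Step 6: closing a file decrements its layout reference count.
        if self.ref_counts[name] == 0:
            raise RuntimeError(f"{name} is not open")
        self.ref_counts[name] -= 1

    def can_return_authorization(self):
        # Step 7: the read directory authorization may be returned only
        # when no file in the directory is still open.
        return all(count == 0 for count in self.ref_counts.values())
```

For example, after `open_file("a")` the method `can_return_authorization()` returns `False` until the matching `close_file("a")` has run.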
Step 1 further comprises the following steps:
Step 11, the client network interaction sending submodule 11 sends a network interaction message to the server module, applying for the read directory authorization of the directory;
Step 12, the server-side network interaction receiving submodule 21 receives the network interaction message and determines that the request is a directory authorization request;
Step 13, the server-side directory authorization submodule 24 processes the request and decides whether to grant the authorization;
Step 14, the server-side network interaction sending submodule 23 sends the result to the client;
Step 15, the client network interaction receiving submodule 12 receives the result, sent by the server module, of whether the read directory authorization for the directory is granted; if it is granted, the client directory authorization processing submodule 18 is notified and the flow goes to step 16; if it is not granted, the flow returns to step 11;
Step 16, the client directory authorization processing submodule 18 records the directory in the list of obtained read directory authorizations.
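Seen from the client, steps 11 to 16 amount to a request-and-retry loop, sketched below. The `send_request` callback and the message fields are hypothetical, and the `max_attempts` bound is an added assumption: the steps above simply return to step 11 when the authorization is refused.

```python
def acquire_read_authorization(directory, send_request, read_list, max_attempts=5):
    """Ask the server for a read directory authorization, retrying on refusal.

    send_request: callable(request dict) -> reply dict with a "granted" key
    read_list: the client's list of obtained read directory authorizations
    """
    request = {"type": "directory_authorization", "directory": directory, "kind": "read"}
    for _ in range(max_attempts):
        reply = send_request(request)       # steps 11 to 15
        if reply["granted"]:
            read_list.append(directory)     # step 16
            return True
    return False                            # gave up after max_attempts (assumption)
```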
Step 2 further comprises the following steps:
Step 21, the client operation trigger submodule 17 triggers a read-directory operation, and the client page cache submodule 13 looks in the local page cache for the pages that hold the directory entries of the directory; if they are not found, go to step 22; if they are found, go to step 27;
Step 22, the client network interaction sending submodule 11 sends a network interaction message to the server module, requesting to read the directory information;
Step 23, the server-side network interaction receiving submodule 21 receives the network interaction message;
Step 24, the server-side file layout acquisition submodule 22 obtains the file layouts of the specified directory entries;
Step 25, the server-side network interaction sending submodule 23 encapsulates the metadata information, including the file layout, in directory entries, encapsulates the directory entries in order in pages, and sends the pages to the client module;
Step 26, the client network interaction receiving submodule 12 receives the pages sent by the server side and hands them to the client page cache submodule 13 for storage;
Step 27, the client directory entry parsing submodule 15 parses the directory entries one by one in the locally cached pages; the name is used to build the directory cache entry, and the file handle and file attributes are used to build the inode; the newly built directory cache entry and inode are handed to the client directory cache entry and inode cache submodule 14 for storage;
Step 28, the client file layout management module 19 merges the file layout with the file layouts already stored in the inode;
Step 29, the client readdir information submission submodule 16 submits the name, inode number, and type to the read-directory operation, and sets to 1 the flag bit (plusplus_done) in the parent directory inode that marks that the read-directory operation has been performed.
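The client-side flow of steps 21 to 29 can be sketched as below. Dictionaries stand in for the page cache, the directory cache, and the parent inode; `fetch_pages` is a hypothetical callback covering the server round trip of steps 22 to 26; and the entry field names are assumptions. Only `plusplus_done` is a name taken from the description.

```python
def read_directory(directory, page_cache, fetch_pages, dentry_cache, parent_inode):
    """Return (name, inode number, type) tuples for every entry in the directory."""
    pages = page_cache.get(directory)            # step 21: look in the local page cache
    if pages is None:
        pages = fetch_pages(directory)           # steps 22 to 26: one server round trip
        page_cache[directory] = pages
    submitted = []
    for page in pages:                           # step 27: parse entries one by one
        for entry in page:
            inode = {
                "file_handle": entry["file_handle"],
                "attributes": entry["attributes"],
            }
            dentry_cache[(directory, entry["name"])] = {
                "inode": inode,
                "layout": entry["layout"],       # step 28, simplified: store the layout
            }
            submitted.append((entry["name"], entry["inode_number"], entry["type"]))
    parent_inode["plusplus_done"] = 1            # step 29: mark readdir++ as done
    return submitted
```

A second call for the same directory is served from `page_cache` and does not invoke `fetch_pages` again.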
Step 3 further comprises the following steps:
Step 31, the client operation trigger submodule 17 triggers a file open operation;
Step 32, the client directory authorization processing submodule 18 checks whether the read directory authorization has been obtained; if so, go to step 33; if not, go to step 34;
Step 33, the client readdir information submission submodule 16 checks whether the flag bit is 1; if it is 1, the read-directory operation has already been performed, and the flow goes to step 35; if it is 0, the read-directory operation has not been performed, and the flow goes to step 34;
Step 34, the client network interaction sending submodule 11 sends a network interaction message and completes the open operation according to the flow used without directory authorization;
Step 35, the client directory cache entry and inode cache submodule 14 checks whether the directory cache entry corresponding to the file to be opened exists in the cache; if it exists, go to step 4; if it does not exist, go to step 5.
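The branching in steps 32 to 35 reduces to a small decision function, sketched here. The function name and the returned labels are hypothetical; the labels simply name the step the flow continues with.

```python
def choose_open_path(has_read_authorization, plusplus_done, dentry_cached):
    """Return the step at which the open operation continues."""
    if not has_read_authorization:
        return "step 34"   # step 32: no authorization, open through the server
    if plusplus_done != 1:
        return "step 34"   # step 33: readdir++ has not been performed yet
    # step 35: the layout is available locally, either directly from the
    # directory cache (step 4) or by parsing the page cache (step 5)
    return "step 4" if dentry_cached else "step 5"
```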
Step 4 further comprises the following steps:
Step 41, the client operation trigger submodule 17 triggers the operation of looking up the directory cache entry of the file;
Step 42, the client directory cache entry and inode cache submodule 14 returns the directory cache entry in the cache directly to the open operation;
Step 43, the client file layout management module 19 increments the file layout reference count by 1, and then, according to the obtained file layout, reads the data content from the position on the data disk corresponding to the file layout.
Step 5 further comprises the following steps:
Step 51, the client directory entry parsing submodule 15 traverses the page cache to find the corresponding directory entry; the matching condition is that the name in the directory entry exactly matches the name of the file to be opened; if it is not found, go to step 52; if it is found, go to step 53;
Step 52, the client operation trigger submodule 17 assigns error information to the directory entry, and the flow goes to step 56;
Step 53, the client directory entry parsing submodule 15 parses all the other information of the corresponding directory entry, including the file handle, file attributes, and file layout; the file handle and file attributes are used to build the inode, and the inode is associated with the directory cache entry;
Step 54, the client directory cache entry and inode cache submodule 14 stores the directory cache entry and the inode;
Step 55, the directory cache entry is returned to the open operation;
Step 56, the client file layout management module 19 merges the file layout with the file layouts already stored in the inode, increments the file layout reference count by 1, and then, according to the obtained file layout, reads the data content from the position on the data disk corresponding to the file layout.
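The page-cache traversal of steps 51 to 55 can be sketched as follows. Pages are modeled as lists of entry dictionaries, and the returned dictionary with its `error` field is a hypothetical way to represent the error information of step 52.

```python
def lookup_in_page_cache(pages, file_name):
    """Find the directory entry for file_name and build a directory cache entry."""
    for page in pages:
        for entry in page:
            if entry["name"] == file_name:        # step 51: exact name match only
                inode = {                         # step 53: build the inode
                    "file_handle": entry["file_handle"],
                    "attributes": entry["attributes"],
                    "layouts": [entry["layout"]],
                }
                # steps 54 and 55: the caller stores and uses the returned entry
                return {"name": file_name, "inode": inode, "error": None}
    # step 52: nothing matched, so the result carries error information
    return {"name": file_name, "inode": None, "error": "not found"}
```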
Step 6 further comprises the following steps:
Step 61, the client operation trigger submodule 17 triggers a file close operation;
Step 62, the client file layout management module 19 decrements the reference count of the file layout by 1.
Step 7 further comprises the following steps:
Step 71, the server-side directory authorization submodule 24 decides to recall a directory authorization;
Step 72, the server-side network interaction sending submodule 23 sends a network interaction message to the client module, notifying it to release the directory authorization;
Step 73, the client network interaction receiving submodule 12 receives the directory authorization recall notice;
Step 74, the client file layout management module 19 checks whether the file layout reference counts of all files in the directory are all 0; if they are all 0, go to step 75; if they are not all 0, wait;
Step 75, the client directory authorization processing submodule 18 removes the directory whose authorization is being recalled from the list of authorized directories;
Step 76, the client page cache submodule 13 and the client directory cache entry and inode cache submodule 14 clear the local cache, and the file layouts are cleared at the same time;
Step 77, the client network interaction sending submodule 11 sends a network interaction message notifying the server side that the read directory authorization has been returned.
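The client's handling of a recall notice (steps 74 to 77) can be sketched as below. Instead of blocking, the function returns `False` when files are still open so that the caller can retry later; this is a simplification of the "wait" in step 74. The data structures and the `notify_server` callback are hypothetical.

```python
def handle_recall(directory, ref_counts, read_list, page_cache, dentry_cache, notify_server):
    """Try to give back the read directory authorization for one directory.

    ref_counts: dict mapping file name -> layout reference count for this directory
    Returns True once the authorization has been returned, False if files are open.
    """
    if any(count != 0 for count in ref_counts.values()):
        return False                               # step 74: some file is still open
    read_list.remove(directory)                    # step 75: unlink from the list
    page_cache.pop(directory, None)                # step 76: clear cached pages ...
    for key in [k for k in dentry_cache if k[0] == directory]:
        del dentry_cache[key]                      # ... and cached entries with layouts
    notify_server(directory)                       # step 77: tell the server
    return True
```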
Claims (13)
1. A system for prefetching file layouts through readdir++ in a cluster file system, for quickly reading massive numbers of small files, characterized in that the system comprises:
a client module (1), configured to obtain a read directory authorization from, or return it to, a server module (2); after obtaining the read directory authorization, to send a read-directory request to the server module (2); and to store the pages containing file layouts sent by the server module (2) in a local cache, wherein when the client module (1) reads a file under the directory, it directly uses the file layout of that file stored in the local cache;
the server module (2), which stores the metadata information of the small files, configured to grant the read directory authorization to, or recall it from, the client module (1); and, upon receiving the read-directory request, to encapsulate the metadata information, including the file layout, in directory entries, encapsulate the directory entries in order in pages, and send the pages to the client module (1).
2. The system for prefetching file layouts through readdir++ in a cluster file system according to claim 1, characterized in that the client module (1) specifically comprises:
a client network interaction sending submodule (11), configured to send network interaction messages to the server module;
a client network interaction receiving submodule (12), configured to receive the network interaction messages sent by the server module;
a client page cache submodule (13), configured to store the pages sent by the server module;
a client directory cache entry and inode cache submodule (14), configured to store directory cache entries and inodes;
a client directory entry parsing submodule (15), configured to traverse all pages in the client page cache submodule (13) and parse the metadata information;
a client readdir information submission submodule (16), configured to submit read-directory information;
a client operation trigger submodule (17), configured to trigger read-directory, lookup, open, and close operations;
a client directory authorization processing submodule (18), configured to check whether a directory authorization has been obtained and to remove a directory whose authorization is being recalled from the list of authorized directories;
a client file layout management module (19), configured to increment and decrement the file layout reference count, merge a file layout with the file layouts already held in the corresponding inode, and read the corresponding data content from the data disk according to the file layout.
3. The system for prefetching file layouts through readdir++ in a cluster file system according to claim 1, characterized in that the server module (2) specifically comprises:
a server-side network interaction receiving submodule (21), configured to receive the network interaction message requests sent by the client module;
a server-side file layout acquisition submodule (22), configured to obtain file layouts and encode them;
a server-side network interaction sending submodule (23), configured to respond to the client module according to the type of the network interaction message received;
a server-side directory authorization submodule (24), configured to grant or recall directory authorizations.
4. The system for prefetching file layouts through readdir++ in a cluster file system according to claim 2, characterized in that the client page cache submodule (13) contains a plurality of pages, and the page cache is organized as follows:
directory entries are stored in order within each page, and pages are indexed by page index number; the number of directory entries in each page is not fixed: if the remaining space of the current page is not enough to hold the next directory entry, a new page is used, until all directory entries of the directory have been stored in pages; an end mark at the end of each directory entry records whether that directory entry is the last directory entry of its page and whether it is the last directory entry of the directory.
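The page organization of claim 4 can be sketched roughly as follows. The 4096-byte page size, the representation of an entry as a (name, size) pair, and the two boolean flag names are assumptions made for illustration; the claim only states that each entry carries an end mark with these two meanings.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pack_entries(entries, page_size=PAGE_SIZE):
    """Pack (name, size_in_bytes) entries into pages, in order.

    Returns a list of pages; each page is a list of entry dicts carrying the
    two end-mark flags described in the claim.
    """
    pages = [[]]
    free = page_size
    for name, size in entries:
        if size > page_size:
            raise ValueError(f"entry {name!r} does not fit in a page")
        if size > free:
            # Not enough room left in the current page: start a new one.
            pages.append([])
            free = page_size
        pages[-1].append({"name": name, "last_in_page": False, "last_in_directory": False})
        free -= size
    for page in pages:
        if page:
            page[-1]["last_in_page"] = True
    if pages[-1]:
        pages[-1][-1]["last_in_directory"] = True
    return pages
```

A reader of the pages can stop at the entry whose `last_in_directory` flag is set, and move to the next page index when it meets `last_in_page`.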
5. The system for prefetching file layouts through readdir++ in a cluster file system according to claim 1 or 4, characterized in that
the directory entry comprises an inode number, a name, a file handle, file attributes, a file layout, and an end mark; wherein the inode number is used for submitting read-directory information; the name is used for building the directory cache entry; and the file handle and file attributes are used for building the inode.
6. A method of the system for prefetching file layouts through readdir++ in a cluster file system according to claims 1-5, characterized in that the method comprises the following steps:
Step 1, the client module obtains the read directory authorization for a file directory from the server module;
Step 2, the client module triggers a read-directory operation and obtains from the server module the information of all directory entries in the directory, including inode number, name, file handle, file attributes, and file layout;
Step 3, the client module triggers a file open operation and determines whether the directory cache entry of the file exists locally in the client directory cache entry and inode cache submodule (14); if it exists, go to step 4; if it does not exist, go to step 5;
Step 4, the client module triggers an operation to look up the directory cache entry of the file; during the lookup it obtains the directory cache entry and its file layout and increments the reference count of the file layout by 1; then, according to the obtained file layout, it reads the data content from the position on the data disk corresponding to the file layout, and goes to step 6;
Step 5, the client module parses the corresponding directory entry in the page cache, builds the directory cache entry using the metadata information in it, and increments the reference count of the file layout by 1; then, according to the obtained file layout, it reads the data content from the position on the data disk corresponding to the file layout;
Step 6, the client module triggers a file close operation and decrements the reference count of the file layout by 1;
Step 7, the client module checks whether the reference counts of the file layouts of all files in the directory are all 0; if they are not all 0, some file has not yet been closed and the client must wait for it to be closed; if they are all 0, the read directory authorization can be returned to the server module.
7. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 1 further comprises the following steps:
Step 11, the client network interaction sending submodule (11) sends a network interaction message to the server module, applying for the read directory authorization of the directory;
Step 12, the server-side network interaction receiving submodule (21) receives the network interaction message and determines that the request is a directory authorization request;
Step 13, the server-side directory authorization submodule (24) processes the request and decides whether to grant the authorization;
Step 14, the server-side network interaction sending submodule (23) sends the result to the client;
Step 15, the client network interaction receiving submodule (12) receives the result, sent by the server module, of whether the read directory authorization for the directory is granted; if it is granted, the client directory authorization processing submodule (18) is notified and the flow goes to step 16; if it is not granted, the flow returns to step 11;
Step 16, the client directory authorization processing submodule (18) records the directory in the list of obtained read directory authorizations.
8. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 2 further comprises the following steps:
Step 21, the client operation trigger submodule (17) triggers a read-directory operation, and the client page cache submodule (13) looks in the local page cache for the pages that hold the directory entries of the directory; if they are not found, go to step 22; if they are found, go to step 27;
Step 22, the client network interaction sending submodule (11) sends a network interaction message to the server module, requesting to read the directory information;
Step 23, the server-side network interaction receiving submodule (21) receives the network interaction message;
Step 24, the server-side file layout acquisition submodule (22) obtains the file layouts of the specified directory entries;
Step 25, the server-side network interaction sending submodule (23) encapsulates the metadata information, including the file layout, in directory entries, encapsulates the directory entries in order in pages, and sends the pages to the client module;
Step 26, the client network interaction receiving submodule (12) receives the pages sent by the server side and hands them to the client page cache submodule (13) for storage;
Step 27, the client directory entry parsing submodule (15) parses the directory entries one by one in the locally cached pages; the name is used to build the directory cache entry, and the file handle and file attributes are used to build the inode; the newly built directory cache entry and inode are handed to the client directory cache entry and inode cache submodule (14) for storage;
Step 28, the client file layout management module (19) merges the file layout with the file layouts already stored in the inode;
Step 29, the client readdir information submission submodule (16) submits the name, inode number, and type to the read-directory operation, and sets to 1 the flag bit in the parent directory inode that marks that the read-directory operation has been performed.
9. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 3 further comprises the following steps:
Step 31, the client operation trigger submodule (17) triggers a file open operation;
Step 32, the client directory authorization processing submodule (18) checks whether the read directory authorization has been obtained; if so, go to step 33; if not, go to step 34;
Step 33, the client readdir information submission submodule (16) checks whether the flag bit is 1; if it is 1, the read-directory operation has already been performed, and the flow goes to step 35; if it is 0, the read-directory operation has not been performed, and the flow goes to step 34;
Step 34, the client network interaction sending submodule (11) sends a network interaction message and completes the open operation according to the flow used without directory authorization;
Step 35, the client directory cache entry and inode cache submodule (14) checks whether the directory cache entry corresponding to the file to be opened exists in the cache; if it exists, go to step 4; if it does not exist, go to step 5.
10. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 4 further comprises the following steps:
Step 41, the client operation trigger submodule (17) triggers the operation of looking up the directory cache entry of the file;
Step 42, the client directory cache entry and inode cache submodule (14) returns the directory cache entry in the cache directly to the open operation;
Step 43, the client file layout management module (19) increments the file layout reference count by 1, and then, according to the obtained file layout, reads the data content from the position on the data disk corresponding to the file layout.
11. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 5 further comprises the following steps:
Step 51, the client directory entry parsing submodule (15) traverses the page cache to find the corresponding directory entry; the matching condition is that the name in the directory entry exactly matches the name of the file to be opened; if it is not found, go to step 52; if it is found, go to step 53;
Step 52, the client operation trigger submodule (17) assigns error information to the directory entry, and the flow goes to step 56;
Step 53, the client directory entry parsing submodule (15) parses all the other information of the corresponding directory entry, including the file handle, file attributes, and file layout; the file handle and file attributes are used to build the inode, and the inode is associated with the directory cache entry;
Step 54, the client directory cache entry and inode cache submodule (14) stores the directory cache entry and the inode;
Step 55, the directory cache entry is returned to the open operation;
Step 56, the client file layout management module (19) merges the file layout with the file layouts already stored in the inode, increments the file layout reference count by 1, and then, according to the obtained file layout, reads the data content from the position on the data disk corresponding to the file layout.
12. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 6 further comprises the following steps:
Step 61, the client operation trigger submodule (17) triggers a file close operation;
Step 62, the client file layout management module (19) decrements the reference count of the file layout by 1.
13. The method of the system for prefetching file layouts through readdir++ in a cluster file system according to claim 6, characterized in that step 7 further comprises the following steps:
Step 71, the server-side directory authorization submodule (24) decides to recall a directory authorization;
Step 72, the server-side network interaction sending submodule (23) sends a network interaction message to the client module, notifying it to release the directory authorization;
Step 73, the client network interaction receiving submodule (12) receives the directory authorization recall notice;
Step 74, the client file layout management module (19) checks whether the file layout reference counts of all files in the directory are all 0; if they are all 0, go to step 75; if they are not all 0, wait;
Step 75, the client directory authorization processing submodule (18) removes the directory whose authorization is being recalled from the list of authorized directories;
Step 76, the client page cache submodule (13) and the client directory cache entry and inode cache submodule (14) clear the local cache, and the file layouts are cleared at the same time;
Step 77, the client network interaction sending submodule (11) sends a network interaction message notifying the server side that the read directory authorization has been returned.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410076739.0A CN103902660B (en) | 2014-03-04 | 2014-03-04 | System and method for prefetching file layout through readdir++ in cluster file system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103902660A CN103902660A (en) | 2014-07-02 |
CN103902660B true CN103902660B (en) | 2017-04-12 |
Family
ID=50993982
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410076739.0A Expired - Fee Related CN103902660B (en) | 2014-03-04 | 2014-03-04 | System and method for prefetching file layout through readdir++ in cluster file system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103902660B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104933144B (en) * | 2015-06-19 | 2018-03-30 | 中国科学院计算技术研究所 | Ensure the system and method for data validity in a kind of parallel network file system |
CN105095353B (en) * | 2015-06-19 | 2018-12-04 | 中国科学院计算技术研究所 | The equal system and method that does after small documents is pre-read in a kind of parallel network file system |
CN105119955B (en) * | 2015-07-09 | 2018-10-09 | 中国科学院计算技术研究所 | The method and system that catalogue multipage is supported are read in a kind of distributed file system |
CN109947719B (en) * | 2019-03-21 | 2022-10-11 | 昆山九华电子设备厂 | Method for improving efficiency of cluster reading directory entries under directory |
CN112286897B (en) * | 2020-10-10 | 2023-01-10 | 苏州浪潮智能科技有限公司 | Method for communication between PNFS server and client |
CN113485639B (en) * | 2021-06-18 | 2024-02-20 | 济南浪潮数据技术有限公司 | IO speed optimization method, system, terminal and storage medium for distributed storage |
CN113608694B (en) * | 2021-07-27 | 2024-03-19 | 北京达佳互联信息技术有限公司 | Data migration method, information processing method, device, server and medium |
CN114003562B (en) * | 2021-12-29 | 2022-03-22 | 苏州浪潮智能科技有限公司 | Directory traversal method, device and equipment and readable storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103179185A (en) * | 2012-12-25 | 2013-06-26 | 中国科学院计算技术研究所 | Method and system for creating files in cache of distributed file system client |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103179185A (en) * | 2012-12-25 | 2013-06-26 | 中国科学院计算技术研究所 | Method and system for creating files in cache of distributed file system client |
Non-Patent Citations (2)
Title |
---|
Research on Implement Snapshot of pNFS Distributed File System;Liu-Chao等;《Applied Mathematics & Information Sciences》;20110331;第5卷(第2期);全文 * |
并行网络文件系统数据管理技术的研究与实现;冯振乾;《中国优秀硕士学位论文全文数据库信息科技辑》;20090715;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN103902660A (en) | 2014-07-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20170412 |