CN103902660B - System and method for prefetching file layout through readdir++ in cluster file system - Google Patents

System and method for prefetching file layout through readdir++ in cluster file system

Info

Publication number
CN103902660B
Authority
CN
China
Prior art keywords
file
client
directory
catalogue
submodule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410076739.0A
Other languages
Chinese (zh)
Other versions
CN103902660A (en
Inventor
杨洪章
张军伟
刘振军
许鲁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Zhongke Bluewhale Information Technology Co ltd
Institute of Computing Technology of CAS
Original Assignee
Tianjin Zhongke Bluewhale Information Technology Co ltd
Institute of Computing Technology of CAS
Application filed by Tianjin Zhongke Bluewhale Information Technology Co ltd and Institute of Computing Technology of CAS
Priority to CN201410076739.0A
Publication of CN103902660A
Application granted
Publication of CN103902660B
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5681 Pre-fetching or pre-delivering data based on network characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a system and method for prefetching file layout through readdir++ in a cluster file system. The system comprises a client module (1) and a server module (2). The client module (1) obtains a read directory delegation from the server module (2) or returns it; after the read directory delegation is obtained, it sends read directory requests to the server module (2); it stores the pages sent by the server module (2), which contain file layouts, in a local cache, and when the client module (1) reads files in the directory, it uses the file layouts stored in the local cache directly. The server module (2) grants the read directory delegation to the client module (1) or recalls it from the client module (1); when a read directory request is received, it encapsulates metadata including the file layout in pages and sends the pages to the client module (1). As a result, the network interaction overhead of acquiring file layouts while reading files is reduced, and read performance for massive numbers of small files is greatly improved.

Description

System and method for prefetching file layout through readdir++ in a cluster file system
Technical field
The present invention relates to metadata prefetching mechanisms in cluster file systems, and more particularly to a system and method for prefetching file layout through readdir++ in a cluster file system.
Background
With the arrival of the big data era, the global volume of data is growing rapidly. In fields such as e-commerce, social networking, and scientific computing, there are more and more small files. Efficiently managing "massive small files" and providing low-latency small-file access has therefore become a new problem facing distributed file systems.
In recent years, an architecture that separates metadata services from data services has become the main trend in distributed file systems. The advantage of this separation is that clients access storage devices directly in an out-of-band manner, which yields higher performance for access to large files.
Small-file access, however, is an entirely different matter. For small files, data access accounts for a small share of the work and metadata access for a large share. Before a client can access file data, it must first obtain the file layout (layout) synchronously through a network interaction (RPC), which makes the latency of a single small-file operation excessive. In particular, when a client continuously reads a large number of small files under the same directory, it must frequently issue a separate synchronous request to obtain the file layout metadata of each small file, which severely affects system performance.
Read directory (readdir) is the file system operation that reads a directory. Its purpose is to obtain basic information about every directory entry (entry) in the directory, such as its name (name), type (type), and inode number (ino).
The directory delegation mechanism (DELEGATION) is a recallable guarantee that a server grants to a client. From the time a delegation is granted until it is recalled, the server guarantees that operations by other clients on the directory will not cause conflicts with the consistency semantics of the file system. In essence, the server gives the client the ability to handle read directory, lookup (lookup), open (open), close (close), read (read), and write (write) operations locally, without interacting with the server. Without directory delegation, each of these operations requires an interaction with the server to complete, and the time overhead is large.
The current parallel network file system (pNFS) uses the readdir+ technique, which is an improvement built on read directory and directory delegation. In addition to the name, type, and inode number, it also prefetches the file handle (fh) and file attributes (fattr) of every directory entry in the directory, two critical pieces of metadata. Afterwards, if the client needs the metadata of a file under that directory, it does not have to send network requests to get the attributes or the handle. It obtains this metadata directly from the local cache, which reduces network interactions with the metadata server.
However, in a distributed file system based on a block interface, the metadata needed by a client's subsequent read operations is not limited to file attributes and file handles. The client also needs the physical block numbers on the data volume that correspond to logical positions in the file, that is, the file layout. The readdir+ technique used by pNFS therefore cannot effectively reduce the network interaction overhead of obtaining file layouts.
Summary of the invention
To solve the above problems, the object of the present invention is to provide a system and method for prefetching file layout through readdir++ in a cluster file system, which can reduce the network interaction overhead of obtaining file layouts when reading files and can substantially improve read performance for massive small files.
The readdir++ technique of the present invention is an improvement on the readdir+ technique: when a directory is read, it prefetches not only file attributes, file handles, and so on, but also the file layout, which avoids one network interaction for obtaining the file layout.
To achieve the above object, the present invention proposes a system for prefetching file layout through readdir++ in a cluster file system, used to prefetch the metadata of massive small files, including file layout, so that the small files can be read quickly, characterized in that the system comprises:
a client module (1), configured to obtain a read directory delegation from the server module (2) or return it; after the read directory delegation is obtained, to send a read directory request to the server module (2); and to store the pages containing file layouts sent by the server module (2) in a local cache, so that when the client module (1) reads a file under the directory, it directly uses the file layout of that file stored in the local cache;
the server module (2), which stores the metadata of the small files, configured to grant the read directory delegation to the client module (1) or recall it; and, when the read directory request is received, to encapsulate the metadata including file layout in directory entries, encapsulate the directory entries in order in pages, and send the pages to the client module (1).
The system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that the client module (1) specifically comprises:
a client send network interaction submodule (11), for sending network interaction messages to the server module;
a client receive network interaction submodule (12), for receiving the network interaction messages sent by the server module;
a client page cache submodule (13), for storing the pages sent by the server module;
a client directory cache entry and index node cache submodule (14), for storing directory cache entries and index nodes;
a client parse directory entry submodule (15), for traversing all pages in the client page cache submodule (13) and parsing out the metadata;
a client submit read directory information submodule (16), for submitting read directory information;
a client operation trigger submodule (17), for triggering read directory, lookup, open, and close operations;
a client directory delegation processing submodule (18), for checking whether a directory delegation has been obtained and for removing a directory delegation to be recalled from the list of delegated directories;
a client file layout management module (19), for incrementing and decrementing the file layout reference count, merging a file layout with the file layout already present in its corresponding index node, and reading the corresponding data content from the data disk according to the file layout.
The system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that the server module (2) specifically comprises:
a server receive network interaction submodule (21), for receiving the network interaction requests sent by the client module;
a server obtain file layout submodule (22), for obtaining a file layout and encoding it;
a server send network interaction submodule (23), for responding to the client module according to the type of network interaction message received;
a server directory delegation submodule (24), for granting or recalling directory delegations.
The system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that the client page cache submodule (13) contains multiple pages, organized as follows:
Each page stores directory entries in order, and pages are indexed by page index number. The number of directory entries in each page is not fixed. If the remaining space of the current page is not enough to hold the next directory entry, a new page is used, until all directory entries in the directory have been stored in pages. An end marker at the end of each directory entry records whether that entry is the last directory entry of its page and whether it is the last directory entry of the directory.
The system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that
the directory entry includes an inode number, name, file handle, file attributes, file layout, and end marker; the inode number is used for submitting read directory information; the name is used to build the directory cache entry; the file handle and file attributes are used to build the index node.
A method for the system for prefetching file layout through readdir++ in a cluster file system as described above, characterized in that the method comprises the following steps:
Step 1: the client module obtains a read directory delegation for the directory from the server module;
Step 2: the client module triggers a read directory operation and obtains from the server module the information of all directory entries in the directory, including inode number, name, file handle, file attributes, and file layout;
Step 3: the client module triggers a file open operation and determines whether the directory cache entry of the file exists locally in the client directory cache entry and index node cache submodule (14); if it exists, go to step 4; if it does not, go to step 5;
Step 4: the client module triggers a lookup of the file's directory cache entry, obtains the directory cache entry and its file layout during the lookup, and increments the reference count of the file layout by 1; then, according to the obtained file layout, it reads the data content from the corresponding position on the data disk and goes to step 6;
Step 5: the client module parses the corresponding directory entry in the page cache, builds the directory cache entry from the metadata in it, and increments the reference count of the file layout by 1; then, according to the obtained file layout, it reads the data content from the corresponding position on the data disk;
Step 6: the client module triggers a file close operation and decrements the reference count of the file layout by 1;
Step 7: the client module checks whether the reference counts of the file layouts of all files in the directory are all 0; if they are not all 0, some file has not been closed and the client must wait for it to close; if they are all 0, the read directory delegation can be returned to the server module.
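The seven steps can be read as a small client-side life cycle driven by layout reference counts. The following Python sketch is illustrative only: the class `DelegatedDirectory` and its method names are hypothetical, the prefetched layouts are stand-in strings, and no real RPC or disk I/O is performed.

```python
class DelegatedDirectory:
    """Hypothetical client-side view of one directory held under a read delegation."""

    def __init__(self, prefetched_layouts):
        # Step 2: layouts arrive with the readdir++ reply (file name -> layout).
        self.layouts = dict(prefetched_layouts)
        # One reference count per file layout, zero until the file is opened.
        self.refcount = {name: 0 for name in self.layouts}

    def open(self, name):
        # Steps 3-5: the layout comes from the local cache, not from a new RPC.
        self.refcount[name] += 1
        return self.layouts[name]

    def close(self, name):
        # Step 6: closing a file drops one reference on its layout.
        self.refcount[name] -= 1

    def can_return_delegation(self):
        # Step 7: the delegation may be returned only when no layout is in use.
        return all(count == 0 for count in self.refcount.values())


d = DelegatedDirectory({"a.txt": "layout-a", "b.txt": "layout-b"})
layout = d.open("a.txt")
busy = d.can_return_delegation()   # a.txt is still open
d.close("a.txt")
idle = d.can_return_delegation()   # every reference count is back to 0
```

In this sketch the reference count is the only thing that blocks the return of the delegation, which mirrors the check in step 7.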
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 1 further comprises the following steps:
Step 11: the client send network interaction submodule (11) sends a network interaction message to the server module to apply for a read directory delegation for the directory;
Step 12: the server receive network interaction submodule (21) receives the network interaction message and determines that the request is a directory delegation request;
Step 13: the server directory delegation submodule (24) processes the request and produces the result of whether the delegation is granted;
Step 14: the server send network interaction submodule (23) sends the result to the client;
Step 15: the client receive network interaction submodule (12) receives the result, sent by the server module, of whether the read directory delegation for the directory was granted; if it was granted, the client directory delegation processing submodule (18) is notified and the flow goes to step 16; if it was not granted, the flow returns to step 11;
Step 16: the client directory delegation processing submodule (18) records the directory in the list of obtained read directory delegations.
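Steps 11 to 16 amount to a request loop that ends once the server grants the delegation. A minimal sketch follows, assuming a hypothetical callable `send_request` that stands in for the RPC and returns 1 (granted) or 0 (not granted). The attempt limit is an addition of this sketch; the text simply returns to step 11.

```python
def acquire_read_delegation(directory, send_request, delegated_dirs, max_attempts=5):
    """Ask for a read directory delegation until the server grants it.

    `send_request` and `delegated_dirs` are hypothetical stand-ins for the
    RPC path (steps 11-15) and the client's list of delegated directories.
    """
    for _ in range(max_attempts):
        if send_request(directory) == 1:       # steps 11-15: request and reply
            delegated_dirs.append(directory)   # step 16: record the directory
            return True
    return False


replies = iter([0, 0, 1])            # fake server: refuses twice, then grants
held = []
granted = acquire_read_delegation("/data/small", lambda d: next(replies), held)
```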
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 2 further comprises the following steps:
Step 21: the client operation trigger submodule (17) triggers a read directory operation, and the client page cache submodule (13) looks in the local page cache for the pages that hold the directory entries of the directory; if they are not found, go to step 22; if they are found, go to step 27;
Step 22: the client send network interaction submodule (11) sends a network interaction message to the server module to request read directory information;
Step 23: the server receive network interaction submodule (21) receives the network interaction message;
Step 24: the server obtain file layout submodule (22) obtains the file layouts of the specified directory entries;
Step 25: the server send network interaction submodule (23) encapsulates the metadata, including file layout, in directory entries, encapsulates the directory entries in order in pages, and sends the pages to the client module;
Step 26: the client receive network interaction submodule (12) receives the pages sent by the server and hands them to the client page cache submodule (13) for storage;
Step 27: the client parse directory entry submodule (15) parses the directory entries one by one from the locally cached pages; the name is used to build the directory cache entry, and the file handle and file attributes are used to build the index node; the newly built directory cache entry and index node are handed to the client directory cache entry and index node cache submodule (14) for storage;
Step 28: the client file layout management module (19) merges the file layout with the file layout already stored in the index node;
Step 29: the client submit read directory information submodule (16) submits the name, inode number, and type to the read directory operation, and sets to 1 the flag in the parent directory's index node that marks that a read directory operation has been performed.
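The client side of steps 21 to 29 can be sketched as a single function that falls back to the server only on a page cache miss. Everything below is a simplified illustration: plain dictionaries stand in for pages, directory cache entries, and index nodes, `fetch_pages` is a hypothetical stand-in for the readdir++ RPC, and the field names (`fh`, `fattr`, `layout`, `plusplus_done`) follow the abbreviations used in the text.

```python
def readdir_plusplus(dir_inode, page_cache, fetch_pages):
    """Client side of steps 21-29 (hypothetical data model).

    dir_inode   : dict for the parent directory's index node
    page_cache  : dict mapping directory inode number -> list of pages
    fetch_pages : callable standing in for the readdir++ RPC
    """
    dir_id = dir_inode["ino"]
    if dir_id not in page_cache:                       # step 21: local miss
        page_cache[dir_id] = fetch_pages(dir_id)       # steps 22-26
    listing = []
    for page in page_cache[dir_id]:                    # step 27: parse entries
        for e in page:
            inode = {"fh": e["fh"], "fattr": e["fattr"], "layouts": []}
            inode["layouts"].append(e["layout"])       # step 28: merge layout
            dir_inode["dentries"][e["name"]] = inode
            listing.append((e["name"], e["ino"], e["type"]))   # step 29
    dir_inode["plusplus_done"] = 1                     # step 29: set the flag
    return listing


calls = []

def fake_server(dir_id):
    calls.append(dir_id)
    return [[{"name": "a", "ino": 11, "type": "file", "fh": b"h1",
              "fattr": {"size": 3}, "layout": (0, 4096)}]]

root = {"ino": 1, "dentries": {}, "plusplus_done": 0}
cache = {}
first = readdir_plusplus(root, cache, fake_server)
second = readdir_plusplus(root, cache, fake_server)    # served from the page cache
```

The second call never reaches `fake_server`, which is the point of step 21: once the pages are cached under a valid delegation, repeated directory reads stay local.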
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 3 further comprises the following steps:
Step 31: the client operation trigger submodule (17) triggers a file open operation;
Step 32: the client directory delegation processing submodule (18) checks whether a read directory delegation has been obtained; if so, go to step 33; if not, go to step 34;
Step 33: the client submit read directory information submodule (16) checks whether the flag is 1; if it is 1, a read directory operation has been performed, so go to step 35; if it is 0, no read directory operation has been performed, so go to step 34;
Step 34: the client send network interaction submodule (11) sends a network interaction message and completes the open operation following the flow used when there is no directory delegation;
Step 35: the client directory cache entry and index node cache submodule (14) checks whether the directory cache entry corresponding to the file to be opened exists in the cache; if it exists, go to step 4; if it does not exist, go to step 5.
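The branching in steps 32 to 35 depends on three conditions only. As a compact illustration, the hypothetical function below returns which path an open operation takes; the return labels are invented for this sketch.

```python
def choose_open_path(has_read_delegation, plusplus_done, dentry_cached):
    """Return the branch taken by an open operation (steps 32-35)."""
    if not has_read_delegation:      # step 32: no delegation held
        return "rpc"                 # step 34: open through the server
    if plusplus_done != 1:           # step 33: readdir++ not yet performed
        return "rpc"                 # step 34
    # step 35: a cached dentry leads to step 4, otherwise to step 5
    return "step4" if dentry_cached else "step5"


paths = [
    choose_open_path(False, 1, True),
    choose_open_path(True, 0, True),
    choose_open_path(True, 1, True),
    choose_open_path(True, 1, False),
]
```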
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 4 further comprises the following steps:
Step 41: the client operation trigger submodule (17) triggers a lookup of the file's directory cache entry;
Step 42: the client directory cache entry and index node cache submodule (14) directly returns the directory cache entry in the cache to the open operation;
Step 43: the client file layout management module (19) increments the file layout reference count by 1, then reads the data content from the position on the data disk that corresponds to the obtained file layout.
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 5 further comprises the following steps:
Step 51: the client parse directory entry submodule (15) traverses the page cache to find the corresponding directory entry, the matching condition being that the name in the directory entry is identical to the name of the file to be opened; if it is not found, go to step 52; if it is found, go to step 53;
Step 52: the client operation trigger submodule (17) assigns an error message to the directory entry and goes to step 56;
Step 53: the client parse directory entry submodule (15) parses all the remaining information of the corresponding directory entry, including the file handle, file attributes, and file layout; the file handle and file attributes are used to build the index node, and the index node is associated with the directory cache entry;
Step 54: the client directory cache entry and index node cache submodule (14) stores the directory cache entry and index node;
Step 55: the directory cache entry is returned to the open operation;
Step 56: the client file layout management module (19) merges the file layout with the file layout already stored in the index node, increments the file layout reference count by 1, then reads the data content from the position on the data disk that corresponds to the obtained file layout.
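Steps 51 to 56 rebuild a directory cache entry from the page cache when the cached one is gone. The sketch below uses hypothetical function names and dictionaries in place of real dentry and inode structures. It also simplifies the not-found case: where the text assigns an error message and continues to step 56, this sketch raises `FileNotFoundError`.

```python
def lookup_in_page_cache(pages, name, dentry_cache):
    """Steps 51-55: find `name` in the cached pages and rebuild its dentry."""
    for page in pages:
        for entry in page:
            if entry["name"] == name:                     # step 51: exact match
                inode = {"fh": entry["fh"], "fattr": entry["fattr"],
                         "layouts": [entry["layout"]], "layout_refs": 0}
                dentry = {"name": name, "inode": inode}   # step 53
                dentry_cache[name] = dentry               # step 54
                return dentry                             # step 55
    return None                                           # step 52: not found


def open_from_page_cache(pages, name, dentry_cache):
    dentry = lookup_in_page_cache(pages, name, dentry_cache)
    if dentry is None:
        # Simplification of step 52 (an assumption of this sketch).
        raise FileNotFoundError(name)
    dentry["inode"]["layout_refs"] += 1                   # step 56
    return dentry["inode"]["layouts"]


pages = [[{"name": "a", "fh": b"h1", "fattr": {"size": 1}, "layout": (0, 4096)}],
         [{"name": "b", "fh": b"h2", "fattr": {"size": 2}, "layout": (8192, 4096)}]]
dcache = {}
layouts = open_from_page_cache(pages, "b", dcache)
```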
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 6 further comprises the following steps:
Step 61: the client operation trigger submodule (17) triggers a file close operation;
Step 62: the client file layout management module (19) decrements the reference count of the file layout by 1.
The method for the system for prefetching file layout through readdir++ in a cluster file system of the present invention, characterized in that step 7 further comprises the following steps:
Step 71: the server directory delegation submodule (24) decides to recall the directory delegation;
Step 72: the server send network interaction submodule (23) sends a network interaction message to the client module, notifying it to release the directory delegation;
Step 73: the client receive network interaction submodule (12) receives the directory delegation recall notice;
Step 74: the client file layout management module (19) checks whether the file layout reference counts of all files in the directory are all 0; if they are all 0, go to step 75; if they are not all 0, wait;
Step 75: the client directory delegation processing submodule (18) removes the directory to be recalled from the list of delegated directories;
Step 76: the client page cache submodule (13) and the client directory cache entry and index node cache submodule (14) clear the local cache, and the file layouts are cleared at the same time;
Step 77: the client send network interaction submodule (11) sends a network interaction message notifying the server that the read directory delegation is returned.
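The client's reaction to a recall (steps 73 to 77) can be sketched as follows. The `state` dictionary and the `notify_server` callable are hypothetical; instead of blocking in step 74, this sketch returns `False` so that the caller can retry later.

```python
def handle_recall(directory, state, notify_server):
    """Steps 73-77 on the client; returns False while files are still open.

    `state` holds four hypothetical structures keyed by directory:
    delegated (list), refcounts (file -> count), page_cache, dentry_cache.
    """
    if any(c != 0 for c in state["refcounts"][directory].values()):
        return False                                  # step 74: must wait
    state["delegated"].remove(directory)              # step 75
    state["page_cache"].pop(directory, None)          # step 76: pages and layouts
    state["dentry_cache"].pop(directory, None)        # step 76: dentries, inodes
    notify_server(directory)                          # step 77
    return True


returned = []
state = {
    "delegated": ["/d"],
    "refcounts": {"/d": {"a": 1, "b": 0}},
    "page_cache": {"/d": ["page0"]},
    "dentry_cache": {"/d": {"a": "dentry-a"}},
}
first = handle_recall("/d", state, returned.append)   # "a" is still open
state["refcounts"]["/d"]["a"] = 0                      # "a" gets closed
second = handle_recall("/d", state, returned.append)
```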
The positive effects of the present invention are as follows:
In the system and method for prefetching file layout through readdir++ in a cluster file system proposed by the present invention, once the client module has completed the read directory operation, the network interaction overhead of obtaining file layouts in subsequent file read operations is reduced. At the same time, other operations (including open, close, lookup, and repeated read directory) no longer require metadata access over the network and can be completed locally on the client, which saves the metadata network interaction overhead entirely. Comparative performance tests show that, in massive small-file application environments, this method obtains a substantial read performance improvement at the cost of a very small amount of client cache.
Description of the drawings
Fig. 1 is a schematic structural diagram of the system for prefetching file layout through readdir++ of the present invention;
Fig. 2 is a schematic diagram of the organization of the page cache of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the system and method for prefetching file layout through readdir++ in a cluster file system of the present invention are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
First, the system for prefetching file layout through readdir++ in a cluster file system of the present invention is described. The system architecture is shown in Fig. 1.
The system includes a client module 1 and a server module 2, where the server module 2 stores the file metadata. Further, the client module includes the following 9 submodules:
Client send network interaction submodule 11:
Used by the client module to send network interaction messages (RPC) to the server module. The scenarios involved in the present invention are: applying for a read directory delegation, requesting read directory information, the open operation, and returning the read directory delegation.
Client receive network interaction submodule 12:
Used by the client module to receive the network interaction messages sent by the server module. The scenarios involved in the present invention are: applying for a read directory delegation and requesting read directory information. The reply to an application for a read directory delegation is the integer 1 or 0, indicating whether the delegation was granted. The reply to a request for read directory information is a set of pages, which the client module stores directly in the local page cache.
Client page cache submodule 13:
The client's local cache consists of two parts. One part is the page cache of the client page cache submodule 13, which stores pages. The other part is the cache of directory cache entries and index nodes of the client directory cache entry and index node cache submodule 14, which stores only directory cache entries and index nodes.
The organization of the page cache is shown in Fig. 2:
Each page stores directory entries in order, and pages are indexed by page index number. Because names differ in length and file attributes differ in which attribute types they contain, directory entries have no fixed size, so the number of directory entries that a page can hold is not fixed either. If the remaining space of the current page is not enough to hold the next directory entry, a new page is used, until all directory entries in the directory have been stored in pages. An end marker (eof) at the end of each directory entry records whether that entry is the last directory entry of its page and whether it is the last directory entry of the directory.
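The packing rule described above can be illustrated with a short Python sketch. It is an assumption-laden model rather than the patented encoding: entries are opaque byte strings, the page size of 4096 bytes is assumed (the text gives none), every entry is assumed to fit in one page, and the end marker is modelled as two booleans stored next to each entry.

```python
PAGE_SIZE = 4096  # assumed page size; the text does not specify one


def pack_entries(entries, page_size=PAGE_SIZE):
    """Pack variable-length directory entries (bytes) into pages, in order.

    Each stored record is (entry, last_in_page, last_in_dir), mirroring the
    two facts that the end marker records.
    """
    pages, current, used = [], [], 0
    for entry in entries:
        if current and used + len(entry) > page_size:
            pages.append(current)          # not enough room: start a new page
            current, used = [], 0
        current.append(entry)
        used += len(entry)
    if current:
        pages.append(current)
    marked = []
    for p, page in enumerate(pages):
        last_page = p == len(pages) - 1
        marked.append([
            (e, i == len(page) - 1, last_page and i == len(page) - 1)
            for i, e in enumerate(page)
        ])
    return marked


pages = pack_entries([b"x" * 60, b"y" * 50, b"z" * 30], page_size=100)
```

With a 100-byte page, the 60-byte entry fills the first page alone because the 50-byte entry does not fit after it, and the remaining two entries share the second page.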
It should be noted in particular that, as long as the client holds the directory delegation, the pages obtained by the read directory operation are guaranteed to be valid. Once the directory delegation is recalled, the information in the page cache is no longer valid, so the page cache must be cleared.
The cache sub-module 14 of client directory cache item and index node:
When catalogue is read, the title for prefetching(name)For building Directory caching item(dentry), file handle(fh)With File attribute(fattr)For index building node(inode), file layout(layout)With existing file cloth in index node Office merges.Directory caching item and index node are associated after operation, then can be deposited Directory caching item and index node In the buffer.
Have always with the caching of index node it should be noted that obtaining catalogue mandate and might not represent Directory caching item Effect, because do not use also resulting in the caching that Directory caching item and index node are removed in client timing for a long time.In catalogue In the case of cache entry is effective, search(lookup)Operation directly returns correspondence Directory caching item;And in Directory caching item catalogue In the case that item is invalid, then the information in effective page cache can be utilized, Directory caching item and index node are set up again. Both the above situation need not carry out the network interaction with meta data server.
Client parse directory entry submodule 15:
This module traverses all pages in the page cache and, within each page, traverses all directory entries one by one and parses out their full content, that is, inode number, name, file handle, file attributes, file layout, and so on. The submodule is used in two scenarios: read directory (readdir) and lookup (lookup).
Client submit read directory information submodule 16:
This module is the core of the original readdir+ flow: the client returns basic information such as the name, type, and inode number of every directory entry under the directory, which achieves the purpose of reading the directory. The client maintains a flag, plusplus_done, in the index node of the parent directory to indicate whether a read directory operation has been performed.
Client operation trigger submodule 17:
In the present invention, the client operations involved are read directory (readdir), lookup (lookup), open (open), and close (close).
Client directory delegation processing submodule 18:
An important foundation of the present invention is that the client obtains a read directory delegation. Only then can the validity of the information in the local cache be guaranteed, so that operations can be carried out entirely locally and RPC interactions are saved. The client maintains two local lists of directories for which it holds a delegation: a read directory delegation list and a read-write directory delegation list. When a directory delegation is obtained, the directory is inserted at the tail of the corresponding list; when the directory delegation is reclaimed, the directory is deleted from that list.
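A minimal sketch of the two client-side lists follows. The class name, method names, and the use of `collections.deque` in place of a linked list are choices made for this illustration, not details from the patent.

```python
from collections import deque


class ClientDelegations:
    """Two lists of delegated directories, one per delegation kind."""

    def __init__(self):
        self.lists = {"read": deque(), "read_write": deque()}

    def granted(self, kind, directory):
        self.lists[kind].append(directory)      # insert at the tail

    def reclaimed(self, kind, directory):
        self.lists[kind].remove(directory)      # delete from its list

    def holds(self, kind, directory):
        return directory in self.lists[kind]


c = ClientDelegations()
c.granted("read", "/a")
c.granted("read", "/b")
c.reclaimed("read", "/a")
```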
Client file layout management module 19:
The main responsibilities of this module are to increment and decrement the file layout reference count, to merge a file layout with the file layout already present in the index node, and to read the corresponding data content from the data disk according to the file layout. A file layout consists mainly of the following parts: start, length, type, mode, and number.
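The five parts of a file layout map naturally onto a small record type. In the sketch below, the field names follow the list above, but their value types and the merge rule are assumptions: the text does not say how two layouts are merged, so this example simply keeps layouts sorted by start and lets a new layout replace an existing one with the same start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileLayout:
    start: int     # starting position
    length: int
    type: str      # placeholder value type, not specified in the text
    mode: str      # placeholder value type, not specified in the text
    number: int


def merge_layout(existing, new):
    """Return `existing` plus `new`, sorted by start.

    A layout with the same start replaces the old one. This merge rule is
    an assumption of the sketch, not a rule stated in the text.
    """
    by_start = {layout.start: layout for layout in existing}
    by_start[new.start] = new
    return [by_start[s] for s in sorted(by_start)]


inode_layouts = [FileLayout(0, 4096, "block", "read", 7)]
inode_layouts = merge_layout(inode_layouts, FileLayout(4096, 4096, "block", "read", 7))
inode_layouts = merge_layout(inode_layouts, FileLayout(0, 8192, "block", "read", 9))
```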
Further, the above-mentioned server module includes the following four submodules:
Server network-interaction receiving submodule 21:
The server receives the network interaction request sent by the client and determines the request type.
Server file layout acquisition submodule 22:
This module obtains the file layout in the same way as the readdir+ technique described in the background section.
Server network-interaction sending submodule 23:
The server responds according to the type of network interaction message received. If the request is readdir++, the server encapsulates the file layout it has obtained, together with information such as the file handle, file attributes and name, in a directory entry. The directory entries are then packed into pages in order and sent to the client in a network interaction message. If the request is for a directory authorization, the server returns whether the authorization is granted.
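Packing directory entries into pages in order can be sketched as follows. The 4096-byte page size, the dictionary representation of an entry and the flag names `last_in_page` and `last_in_dir` are assumptions; the two flags mirror the end flag that claim 4 attaches to each directory entry.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pack_entries(entries, page_size=PAGE_SIZE):
    """Pack (name, payload) pairs into pages, preserving their order."""
    pages, current, used = [], [], 0
    for name, payload in entries:
        size = len(payload)
        if size > page_size:
            raise ValueError(f"entry {name!r} does not fit in one page")
        # Start a new page when the next entry no longer fits.
        if current and used + size > page_size:
            current[-1]["last_in_page"] = True
            pages.append(current)
            current, used = [], 0
        current.append({"name": name, "payload": payload,
                        "last_in_page": False, "last_in_dir": False})
        used += size
    if current:
        current[-1]["last_in_page"] = True
        current[-1]["last_in_dir"] = True
        pages.append(current)
    return pages


pages = pack_entries([("a", b"x" * 3000), ("b", b"y" * 2000), ("c", b"z" * 1000)])
```

Here entry `b` does not fit behind `a`, so it opens a second page that it shares with `c`.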
Server directory authorization submodule 24:
The server grants directory authorizations to clients; this is the basis on which a client can keep file layouts valid in its cache. The module consists of two functions: granting directory authorizations and recalling them. For each directory the server maintains a linked list of granted authorizations, recording which clients hold an authorization for that directory. Note that at any moment at most one client holds the read-write directory authorization for a given directory, whereas several clients may hold read directory authorizations for it.
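The granting rule stated above (any number of read holders, at most one read-write holder per directory) can be sketched like this. The class and method names are hypothetical, and the sketch deliberately does not model how read and read-write holders exclude each other, since the text does not specify it.

```python
class ServerDirAuthorizations:
    """Per-directory record of which clients hold an authorization."""

    def __init__(self):
        self.readers = {}  # directory -> list of client ids (read)
        self.writer = {}   # directory -> single client id (read-write)

    def grant(self, directory, client, kind):
        if kind == "read":
            holders = self.readers.setdefault(directory, [])
            if client not in holders:
                holders.append(client)
            return True
        if kind == "read_write":
            holder = self.writer.get(directory)
            if holder is not None and holder != client:
                return False  # another client already holds read-write
            self.writer[directory] = client
            return True
        raise ValueError(f"unknown authorization kind: {kind}")

    def release(self, directory, client):
        # Called once the client has returned its authorization.
        if client in self.readers.get(directory, []):
            self.readers[directory].remove(client)
        if self.writer.get(directory) == client:
            del self.writer[directory]


server = ServerDirAuthorizations()
server.grant("/d", "c1", "read")
server.grant("/d", "c2", "read")
first = server.grant("/d", "c1", "read_write")
second = server.grant("/d", "c2", "read_write")
```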
The method used by the system of the present invention for prefetching file layouts through readdir++ in a cluster file system is described below.
The method comprises the following steps:
Step 1: the client module obtains the read directory authorization for a file directory from the server module;
Step 2: the client module triggers a read-directory operation and obtains from the server module the information of all directory entries in the directory, including inode number, name, file handle, file attributes and file layout;
Step 3: the client module triggers a file open operation and determines whether the directory cache entry of the file exists in the client cache submodule 14 for directory cache entries and inodes; if it exists, go to step 4; if it does not exist, go to step 5;
Step 4: the client module triggers a lookup of the file's directory cache entry, obtains the directory cache entry and its file layout during the lookup, and increments the reference count of the file layout by 1; it then reads the data from the position on the data disk corresponding to the obtained file layout, and goes to step 6;
Step 5: the client module parses the corresponding directory entry in the page cache, builds a directory cache entry from the metadata it contains, and increments the reference count of the file layout by 1; it then reads the data from the position on the data disk corresponding to the obtained file layout;
Step 6: the client module triggers a file close operation and decrements the reference count of the file layout by 1;
Step 7: the client module checks whether the reference counts of the file layouts of all files in the directory are 0. If they are not all 0, some file has not yet been closed and the client must wait for it to close; if they are all 0, the read directory authorization can be returned to the server module.
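Steps 3 to 7 amount to reference counting on prefetched layouts. The sketch below illustrates that idea under simplifying assumptions: the class `DirectoryState` and its attribute names are hypothetical, layouts are plain strings, and the page cache is modelled as a name-to-layout mapping rather than as pages of directory entries.

```python
class DirectoryState:
    """Client-side state for one directory held under a read authorization."""

    def __init__(self, prefetched):
        # prefetched: file name -> layout obtained by the read-directory step
        self.page_cache = dict(prefetched)
        self.dentry_cache = {}  # name -> layout, built lazily on open
        self.refcounts = {}     # name -> open references to the layout

    def open(self, name):
        if name not in self.dentry_cache:
            # Step 5: build the directory cache entry from the page cache.
            self.dentry_cache[name] = self.page_cache[name]
        # Steps 4 and 5: take a reference on the file layout.
        self.refcounts[name] = self.refcounts.get(name, 0) + 1
        return self.dentry_cache[name]

    def close(self, name):
        # Step 6: drop the reference taken at open time.
        self.refcounts[name] -= 1

    def can_return_authorization(self):
        # Step 7: the authorization may go back only when nothing is open.
        return all(count == 0 for count in self.refcounts.values())


state = DirectoryState({"a.txt": "layout-a", "b.txt": "layout-b"})
state.open("a.txt")
state.open("b.txt")
state.close("a.txt")
busy = state.can_return_authorization()
state.close("b.txt")
idle = state.can_return_authorization()
```

No server round trip appears in `open`, which is the point of prefetching the layouts during the read-directory step.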
Step 1 further comprises the following steps:
Step 11: the client network-interaction sending submodule 11 sends a network interaction message to the server module to request the read directory authorization for the directory;
Step 12: the server network-interaction receiving submodule 21 receives the message and determines that the request is a directory authorization request;
Step 13: the server directory authorization submodule 24 processes the request and produces the result, granted or not granted;
Step 14: the server network-interaction sending submodule 23 sends the result to the client;
Step 15: the client network-interaction receiving submodule 12 receives the result, sent by the server module, indicating whether the read directory authorization for the directory has been granted; if it has been granted, the client directory authorization processing submodule 18 is notified and the flow goes to step 16; if it has not been granted, the flow returns to step 11;
Step 16: the client directory authorization processing submodule 18 records the directory in the list of obtained read directory authorizations.
Step 2 further comprises the following steps:
Step 21: the client operation trigger submodule 17 triggers a read-directory operation, and the client page cache submodule 13 looks in the local page cache for the pages that hold the directory entries of the directory; if they are not found, go to step 22; if they are found, go to step 27;
Step 22: the client network-interaction sending submodule 11 sends a network interaction message to the server module to request the directory information;
Step 23: the server network-interaction receiving submodule 21 receives the network interaction message;
Step 24: the server file layout acquisition submodule 22 obtains the file layout of each specified directory entry;
Step 25: the server network-interaction sending submodule 23 encapsulates the metadata, including the file layout, in directory entries, packs the directory entries into pages in order, and sends the pages to the client module;
Step 26: the client network-interaction receiving submodule 12 receives the pages sent by the server and hands them to the client page cache submodule 13 for storage;
Step 27: the client directory entry parsing submodule 15 parses the directory entries one by one in the locally cached pages; the name is used to build the directory cache entry, and the file handle and file attributes are used to build the inode; the newly built directory cache entries and inodes are handed to the client cache submodule 14 for directory cache entries and inodes for storage;
Step 28: the client file layout management module 19 merges the file layout with the file layout already stored in the inode;
Step 29: the client read-directory-information submission submodule 16 submits the name, inode number and type to the read-directory operation, and sets to 1 the flag bit (plusplus_done) in the parent directory's inode that marks that a read-directory operation has been performed.
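Steps 21 to 29 can be sketched on the client side as a single function. Everything concrete here is an assumption for illustration: the function name `readdir_plusplus`, the dictionary-based client state, the entry field names, the rule that merging appends segments not yet cached, and the `fake_server` callable that stands in for the network round trip of steps 22 to 26.

```python
def readdir_plusplus(directory, client, fetch_pages):
    """Return (name, inode number, type) tuples for one directory."""
    pages = client["page_cache"].get(directory)
    if pages is None:
        # Steps 22-26: cache miss, ask the server and keep the pages.
        pages = fetch_pages(directory)
        client["page_cache"][directory] = pages
    listing = []
    for page in pages:
        for entry in page:
            # Step 27: name -> directory cache entry; handle, attrs -> inode.
            client["dentries"][(directory, entry["name"])] = entry["ino"]
            inode = client["inodes"].setdefault(entry["ino"], {"layout": []})
            inode["handle"] = entry["handle"]
            inode["attrs"] = entry["attrs"]
            # Step 28 (assumed merge rule): add segments not already cached.
            for segment in entry["layout"]:
                if segment not in inode["layout"]:
                    inode["layout"].append(segment)
            listing.append((entry["name"], entry["ino"], entry["type"]))
    # Step 29: record that this directory has been read.
    client["plusplus_done"][directory] = True
    return listing


def new_client():
    return {"page_cache": {}, "dentries": {}, "inodes": {}, "plusplus_done": {}}


calls = []


def fake_server(directory):
    # Hypothetical stand-in for the server side of steps 23-25.
    calls.append(directory)
    return [[{"name": "a.txt", "ino": 7, "type": "file", "handle": b"h7",
              "attrs": {"size": 10}, "layout": [(0, 10)]}]]


client = new_client()
first = readdir_plusplus("/d", client, fake_server)
second = readdir_plusplus("/d", client, fake_server)
```

The second call is served from the local page cache, so `fake_server` is contacted only once.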
Step 3 further comprises the following steps:
Step 31: the client operation trigger submodule 17 triggers a file open operation;
Step 32: the client directory authorization processing submodule 18 checks whether the read directory authorization has been obtained; if it has, go to step 33; if it has not, go to step 34;
Step 33: the client read-directory-information submission submodule 16 checks whether the flag bit is 1; if it is 1, a read-directory operation has already been performed, and the flow goes to step 35; if it is 0, no read-directory operation has been performed, and the flow goes to step 34;
Step 34: the client network-interaction sending submodule 11 sends a network interaction message, and the open operation is completed according to the flow used without directory authorization;
Step 35: the client cache submodule 14 for directory cache entries and inodes checks whether the directory cache entry corresponding to the file to be opened exists in the cache; if it exists, go to step 4; if it does not exist, go to step 5.
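The branching in steps 32 to 35 reduces to three boolean checks. The function below is a hypothetical condensation of that decision; the returned labels are illustrative names, not identifiers from the patent.

```python
def choose_open_path(has_read_authorization, plusplus_done, dentry_cached):
    """Decide how an open is served, following steps 32 to 35."""
    if not has_read_authorization:
        return "open_without_authorization"  # step 32 -> step 34
    if not plusplus_done:
        return "open_without_authorization"  # step 33 -> step 34
    if dentry_cached:
        return "use_cached_dentry"           # step 35 -> step 4
    return "build_from_page_cache"           # step 35 -> step 5
```

Only the last two outcomes avoid a network round trip; both require the authorization and a completed read-directory operation.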
Step 4 further comprises the following steps:
Step 41: the client operation trigger submodule 17 triggers a lookup of the file's directory cache entry;
Step 42: the client cache submodule 14 for directory cache entries and inodes returns the cached directory cache entry directly to the open operation;
Step 43: the client file layout management module 19 increments the file layout reference count by 1, and then reads the data from the position on the data disk corresponding to the obtained file layout.
Step 5 further comprises the following steps:
Step 51: the client directory entry parsing submodule 15 traverses the page cache to find the corresponding directory entry; the matching condition is that the name in the directory entry is exactly the same as the name of the file to be opened; if no entry is found, go to step 52; if one is found, go to step 53;
Step 52: the client operation trigger submodule 17 assigns error information to the directory entry, and the flow goes to step 56;
Step 53: the client directory entry parsing submodule 15 parses the remaining details of the matching directory entry, including file handle, file attributes and file layout; the file handle and file attributes are used to build the inode, and the inode is associated with the directory cache entry;
Step 54: the client cache submodule 14 for directory cache entries and inodes stores the directory cache entry and the inode;
Step 55: the directory cache entry is returned to the open operation;
Step 56: the client file layout management module 19 merges the file layout with the file layout already stored in the inode, increments the file layout reference count by 1, and then reads the data from the position on the data disk corresponding to the obtained file layout.
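The exact-name search of step 51 and the construction of step 53 can be sketched as below. The entry field names and the function names are assumptions, and the miss case simply returns `None` instead of modelling the error information of step 52.

```python
def find_entry(pages, name):
    """Step 51: scan every cached page for an entry with exactly this name."""
    for page in pages:
        for entry in page:
            if entry["name"] == name:  # exact match only, no prefix matching
                return entry
    return None


def build_from_page_cache(pages, name):
    entry = find_entry(pages, name)
    if entry is None:
        return None
    # Step 53: the handle and attributes build the inode, which is then
    # associated with the directory cache entry built from the name.
    inode = {"handle": entry["handle"], "attrs": entry["attrs"],
             "layout": list(entry["layout"])}
    return {"name": entry["name"], "inode": inode}


pages = [
    [{"name": "a.txt", "handle": b"h1", "attrs": {"size": 1}, "layout": [(0, 1)]}],
    [{"name": "a.txt.bak", "handle": b"h2", "attrs": {"size": 2}, "layout": [(0, 2)]}],
]
hit = build_from_page_cache(pages, "a.txt.bak")
miss = build_from_page_cache(pages, "a")
```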
Step 6 further comprises the following steps:
Step 61: the client operation trigger submodule 17 triggers a file close operation;
Step 62: the client file layout management module 19 decrements the reference count of the file layout by 1.
Step 7 further comprises the following steps:
Step 71: the server directory authorization submodule 24 decides to recall the directory authorization;
Step 72: the server network-interaction sending submodule 23 sends a network interaction message to the client module, notifying it to release the directory authorization;
Step 73: the client network-interaction receiving submodule 12 receives the directory authorization recall notice;
Step 74: the client file layout management module 19 checks whether the file layout reference counts of all files in the directory are 0; if they are all 0, go to step 75; if they are not all 0, wait;
Step 75: the client directory authorization processing submodule 18 removes the directory being recalled from the list of authorized directories;
Step 76: the client page cache submodule 13 and the client cache submodule 14 for directory cache entries and inodes clear the local cache; the file layouts are cleared at the same time;
Step 77: the client network-interaction sending submodule 11 sends a network interaction message notifying the server that the read directory authorization has been returned.
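The waiting behaviour of steps 73 to 77 can be sketched with a condition variable. The class `RecallableDirectory`, its attributes and the `timeout` parameter are assumptions added for illustration; the text only says that the client waits until all reference counts are 0.

```python
import threading


class RecallableDirectory:
    """Client-side handling of a directory authorization recall."""

    def __init__(self):
        self._cond = threading.Condition()
        self._open_refs = 0      # sum of layout reference counts
        self.authorized = True
        self.cached = True

    def open_file(self):
        with self._cond:
            self._open_refs += 1

    def close_file(self):
        with self._cond:
            self._open_refs -= 1
            if self._open_refs == 0:
                self._cond.notify_all()

    def handle_recall(self, notify_server, timeout=None):
        with self._cond:
            # Step 74: wait until no file in the directory is open.
            if not self._cond.wait_for(lambda: self._open_refs == 0, timeout):
                return False
            self.authorized = False  # step 75: drop from the authorized list
            self.cached = False      # step 76: clear caches and layouts
        notify_server()              # step 77: tell the server it is returned
        return True


directory = RecallableDirectory()
directory.open_file()
too_early = directory.handle_recall(lambda: None, timeout=0.01)
directory.close_file()
sent = []
done = directory.handle_recall(lambda: sent.append("returned"))
```

The first recall attempt times out because one file is still open; after the close, the recall completes and the server is notified.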

Claims (13)

1. A system for prefetching file layouts through readdir++ in a cluster file system, for quickly reading massive numbers of small files, characterized in that the system comprises:
a client module (1), configured to obtain a read directory authorization from, or return it to, a server module (2); after obtaining the read directory authorization, to send a read-directory request to the server module (2); and to store the pages containing file layouts sent by the server module (2) in a local cache, so that when the client module (1) reads a file under the directory it directly uses the file layout of that file stored in the local cache;
the server module (2), which stores the metadata of the small files and is configured to grant the read directory authorization to, or recall it from, the client module (1); and, on receiving the read-directory request, to encapsulate the metadata, including the file layout, in directory entries, pack the directory entries into pages in order, and send the pages to the client module (1).
2. The system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 1, characterized in that the client module (1) specifically comprises:
a client network-interaction sending submodule (11), for sending network interaction messages to the server module;
a client network-interaction receiving submodule (12), for receiving the network interaction messages sent by the server module;
a client page cache submodule (13), for storing the pages sent by the server module;
a client cache submodule (14) for directory cache entries and inodes, for storing directory cache entries and inodes;
a client directory entry parsing submodule (15), for traversing all pages in the client page cache submodule (13) and parsing out the metadata;
a client read-directory-information submission submodule (16), for submitting the read-directory information;
a client operation trigger submodule (17), for triggering the read directory, lookup, open and close operations;
a client directory authorization processing submodule (18), for checking whether a directory authorization has been obtained and for removing a directory whose authorization is being recalled from the list of authorized directories;
a client file layout management module (19), for incrementing and decrementing the file layout reference count, merging a file layout with the file layout already held in the corresponding inode, and reading the corresponding data from the data disk according to the file layout.
3. The system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 1, characterized in that the server module (2) specifically comprises:
a server network-interaction receiving submodule (21), for receiving the network interaction requests sent by the client module;
a server file layout acquisition submodule (22), for obtaining the file layout and encoding it;
a server network-interaction sending submodule (23), for responding to the client module according to the type of network interaction message received;
a server directory authorization submodule (24), for granting directory authorizations or recalling them.
4. The system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 2, characterized in that the client page cache submodule (13) contains multiple pages, and the page cache is organized as follows:
directory entries are stored in order within each page, and pages are indexed by page index number; the number of directory entries per page is not fixed: if the remaining space of the current page is insufficient to hold the next directory entry, a new page is used, until all directory entries in the directory have been stored in pages; an end flag at the end of each directory entry records whether that entry is the last directory entry of its page and whether it is the last directory entry of the directory.
5. The system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 1 or 4, characterized in that
the directory entry comprises an inode number, a name, a file handle, file attributes, a file layout and an end flag; wherein the inode number is used for submitting the read-directory information; the name is used for building the directory cache entry; and the file handle and file attributes are used for building the inode.
6. A method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in any one of claims 1-5, characterized in that the method comprises the following steps:
Step 1: the client module obtains the read directory authorization for a file directory from the server module;
Step 2: the client module triggers a read-directory operation and obtains from the server module the information of all directory entries in the directory, including inode number, name, file handle, file attributes and file layout;
Step 3: the client module triggers a file open operation and determines whether the directory cache entry of the file exists in the client cache submodule (14) for directory cache entries and inodes; if it exists, go to step 4; if it does not exist, go to step 5;
Step 4: the client module triggers a lookup of the file's directory cache entry, obtains the directory cache entry and its file layout during the lookup, and increments the reference count of the file layout by 1; it then reads the data from the position on the data disk corresponding to the obtained file layout, and goes to step 6;
Step 5: the client module parses the corresponding directory entry in the page cache, builds a directory cache entry from the metadata it contains, and increments the reference count of the file layout by 1; it then reads the data from the position on the data disk corresponding to the obtained file layout;
Step 6: the client module triggers a file close operation and decrements the reference count of the file layout by 1;
Step 7: the client module checks whether the reference counts of the file layouts of all files in the directory are 0; if they are not all 0, some file has not yet been closed and the client must wait for it to close; if they are all 0, the read directory authorization can be returned to the server module.
7. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 1 further comprises the following steps:
Step 11: the client network-interaction sending submodule (11) sends a network interaction message to the server module to request the read directory authorization for the directory;
Step 12: the server network-interaction receiving submodule (21) receives the message and determines that the request is a directory authorization request;
Step 13: the server directory authorization submodule (24) processes the request and produces the result, granted or not granted;
Step 14: the server network-interaction sending submodule (23) sends the result to the client;
Step 15: the client network-interaction receiving submodule (12) receives the result, sent by the server module, indicating whether the read directory authorization for the directory has been granted; if it has been granted, the client directory authorization processing submodule (18) is notified and the flow goes to step 16; if it has not been granted, the flow returns to step 11;
Step 16: the client directory authorization processing submodule (18) records the directory in the list of obtained read directory authorizations.
8. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 2 further comprises the following steps:
Step 21: the client operation trigger submodule (17) triggers a read-directory operation, and the client page cache submodule (13) looks in the local page cache for the pages that hold the directory entries of the directory; if they are not found, go to step 22; if they are found, go to step 27;
Step 22: the client network-interaction sending submodule (11) sends a network interaction message to the server module to request the directory information;
Step 23: the server network-interaction receiving submodule (21) receives the network interaction message;
Step 24: the server file layout acquisition submodule (22) obtains the file layout of each specified directory entry;
Step 25: the server network-interaction sending submodule (23) encapsulates the metadata, including the file layout, in directory entries, packs the directory entries into pages in order, and sends the pages to the client module;
Step 26: the client network-interaction receiving submodule (12) receives the pages sent by the server and hands them to the client page cache submodule (13) for storage;
Step 27: the client directory entry parsing submodule (15) parses the directory entries one by one in the locally cached pages; the name is used to build the directory cache entry, and the file handle and file attributes are used to build the inode; the newly built directory cache entries and inodes are handed to the client cache submodule (14) for directory cache entries and inodes for storage;
Step 28: the client file layout management module (19) merges the file layout with the file layout already stored in the inode;
Step 29: the client read-directory-information submission submodule (16) submits the name, inode number and type to the read-directory operation, and sets to 1 the flag bit in the parent directory's inode that marks that a read-directory operation has been performed.
9. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 3 further comprises the following steps:
Step 31: the client operation trigger submodule (17) triggers a file open operation;
Step 32: the client directory authorization processing submodule (18) checks whether the read directory authorization has been obtained; if it has, go to step 33; if it has not, go to step 34;
Step 33: the client read-directory-information submission submodule (16) checks whether the flag bit is 1; if it is 1, a read-directory operation has already been performed, and the flow goes to step 35; if it is 0, no read-directory operation has been performed, and the flow goes to step 34;
Step 34: the client network-interaction sending submodule (11) sends a network interaction message, and the open operation is completed according to the flow used without directory authorization;
Step 35: the client cache submodule (14) for directory cache entries and inodes checks whether the directory cache entry corresponding to the file to be opened exists in the cache; if it exists, go to step 4; if it does not exist, go to step 5.
10. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 4 further comprises the following steps:
Step 41: the client operation trigger submodule (17) triggers a lookup of the file's directory cache entry;
Step 42: the client cache submodule (14) for directory cache entries and inodes returns the cached directory cache entry directly to the open operation;
Step 43: the client file layout management module (19) increments the file layout reference count by 1, and then reads the data from the position on the data disk corresponding to the obtained file layout.
11. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 5 further comprises the following steps:
Step 51: the client directory entry parsing submodule (15) traverses the page cache to find the corresponding directory entry; the matching condition is that the name in the directory entry is exactly the same as the name of the file to be opened; if no entry is found, go to step 52; if one is found, go to step 53;
Step 52: the client operation trigger submodule (17) assigns error information to the directory entry, and the flow goes to step 56;
Step 53: the client directory entry parsing submodule (15) parses the remaining details of the matching directory entry, including file handle, file attributes and file layout; the file handle and file attributes are used to build the inode, and the inode is associated with the directory cache entry;
Step 54: the client cache submodule (14) for directory cache entries and inodes stores the directory cache entry and the inode;
Step 55: the directory cache entry is returned to the open operation;
Step 56: the client file layout management module (19) merges the file layout with the file layout already stored in the inode, increments the file layout reference count by 1, and then reads the data from the position on the data disk corresponding to the obtained file layout.
12. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 6 further comprises the following steps:
Step 61: the client operation trigger submodule (17) triggers a file close operation;
Step 62: the client file layout management module (19) decrements the reference count of the file layout by 1.
13. The method for the system for prefetching file layouts through readdir++ in a cluster file system as claimed in claim 6, characterized in that step 7 further comprises the following steps:
Step 71: the server directory authorization submodule (24) decides to recall the directory authorization;
Step 72: the server network-interaction sending submodule (23) sends a network interaction message to the client module, notifying it to release the directory authorization;
Step 73: the client network-interaction receiving submodule (12) receives the directory authorization recall notice;
Step 74: the client file layout management module (19) checks whether the file layout reference counts of all files in the directory are 0; if they are all 0, go to step 75; if they are not all 0, wait;
Step 75: the client directory authorization processing submodule (18) removes the directory being recalled from the list of authorized directories;
Step 76: the client page cache submodule (13) and the client cache submodule (14) for directory cache entries and inodes clear the local cache; the file layouts are cleared at the same time;
Step 77: the client network-interaction sending submodule (11) sends a network interaction message notifying the server that the read directory authorization has been returned.
CN201410076739.0A 2014-03-04 2014-03-04 System and method for prefetching file layout through readdir++ in cluster file system Expired - Fee Related CN103902660B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410076739.0A CN103902660B (en) 2014-03-04 2014-03-04 System and method for prefetching file layout through readdir++ in cluster file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410076739.0A CN103902660B (en) 2014-03-04 2014-03-04 System and method for prefetching file layout through readdir++ in cluster file system

Publications (2)

Publication Number Publication Date
CN103902660A CN103902660A (en) 2014-07-02
CN103902660B true CN103902660B (en) 2017-04-12

Family

ID=50993982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410076739.0A Expired - Fee Related CN103902660B (en) 2014-03-04 2014-03-04 System and method for prefetching file layout through readdir++ in cluster file system

Country Status (1)

Country Link
CN (1) CN103902660B (en)

Also Published As

Publication number Publication date
CN103902660A (en) 2014-07-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170412