CN101149755A - Distributed file system file writing system and method - Google Patents
- Publication number: CN101149755A (application number CN200710167900)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a file writing system and method for a distributed file system. When the storage server needs to allocate data blocks for a target file, it can directly use the preallocation information held in its cache. When several clients allocate data blocks for their own files at the same time, the storage server can exploit the preallocation mechanism so that the data of each target file is stored on disk as contiguously as possible rather than interleaved. The invention thereby solves the problem that, when several clients allocate data blocks for their own files at the same time, the data blocks that the storage server allocates to the target files are interleaved and the data blocks of each file are distributed discontinuously on disk.
Description
Technical field
The present invention relates to the field of computer storage, and in particular to a file writing system and method for a distributed file system.
Background art
A cluster is a structure composed of multiple interconnected independent computers. These computers may be uniprocessor or multiprocessor systems (PCs, workstations or SMPs), and each has its own memory, I/O devices and operating system. To users and applications a cluster appears as a single system, and it can provide a high-performance environment and fast, reliable service at low cost. Because of this high performance-to-price ratio, the cluster has become the mainstream architecture for high-performance computing.
A cluster system is usually equipped with large-capacity storage devices, which must be managed while the system is running. The cluster system also needs to provide a good file-sharing service to users on different cluster nodes. A distributed file system provides these services for the cluster: it integrates all storage devices in the cluster system and establishes a unified namespace (the organizational structure of files and directories). Every node sees the same directory structure, and users on different nodes can access the same file transparently. Data in a distributed file system is not necessarily stored on the disks of the local node, so dedicated storage servers are usually provided. Taking a write as an example: when an application process writes data through a distributed file system client, the client first sends the data over the network to the storage server, and the storage server then writes the received data to the disk of its own node.
In the Linux operating system, a process accesses a file as follows: (1) open, (2) read/write, (3) read/write, (4) read/write, ..., (n) close. That is, before accessing a file the process must first open it with the open operation, which returns a file descriptor fd. Once the file is open, the process can call the read/write functions with fd as a parameter to perform reads and writes. When it has finished, the process must close the file with the close operation.
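The open, write, close sequence described above can be illustrated with a minimal user-space sketch. It uses Python's thin wrappers over the corresponding system calls; the file name and the data chunks are hypothetical and chosen only for the illustration.

```python
import os
import tempfile

# Hypothetical file name, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.dat")

# (1) open: returns a file descriptor fd
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)

# (2)..(n-1) write: every call passes the same fd
written = 0
for chunk in (b"first ", b"second ", b"third"):
    written += os.write(fd, chunk)

# (n) close: the descriptor is released
os.close(fd)

size = os.path.getsize(path)
```

All three writes share a single open and a single close, which is the local access pattern that the later sections contrast with the per-request open and close on a storage server.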
In a file system, the on-disk layout of files is closely tied to I/O performance. To improve overall I/O performance, existing file systems try, when allocating data blocks, to place the blocks of one file contiguously on disk. This reduces head movement while file data is being written, and it also lets file read-ahead work to full effect when the file is read. However, when several processes need to allocate data blocks for the files they are writing at the same time, they compete for the system's free blocks, so that one contiguous region of free blocks is handed out alternately to several files, which lowers the on-disk contiguity of each file's data blocks.
To address this problem, the ext3 file system introduced a block reservation allocation mechanism that reduces the interference between concurrent processes during block allocation. ext3 maintains preallocation state information for each file that needs data blocks; in effect, it reserves a contiguous region of data blocks for each such file. When several files request data blocks at the same time, each allocates from its own reserved region, which keeps the blocks of different files from being interleaved and gives each file a higher degree of contiguity. In addition, the preallocation state of each file is continually adjusted according to the process's access pattern. For example, when the system recognizes that a process keeps allocating data blocks for a file sequentially, it enlarges the file's block reservation window, that is, it reserves more data blocks for the file; conversely, if the process does not allocate blocks sequentially, the system shrinks the window. These adjustments let the block reservation mechanism work to best effect.
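The effect of per-file reservation can be seen in a toy simulation. The sketch below is not ext3 code: the `Disk` and `Reservation` classes, the initial window size of 8 blocks, and the double-or-halve policy are assumptions made only to illustrate how a per-file window avoids interleaving.

```python
class Disk:
    """Toy disk that hands out free block numbers in increasing order."""
    def __init__(self):
        self.next_free = 0

    def take(self, count):
        start = self.next_free
        self.next_free += count
        return start


class Reservation:
    """Hypothetical per-file reservation window (not the ext3 structure)."""
    def __init__(self, disk, size=8, max_size=64):
        self.disk, self.size, self.max_size = disk, size, max_size
        self.start = self.end = 0      # empty window
        self.used_once = False

    def alloc(self, sequential=True):
        if self.start == self.end:     # window exhausted: reserve a new one
            if self.used_once:         # adapt the window to the access pattern
                if sequential:
                    self.size = min(self.size * 2, self.max_size)
                else:
                    self.size = max(self.size // 2, 1)
            self.used_once = True
            self.start = self.disk.take(self.size)
            self.end = self.start + self.size
        block = self.start
        self.start += 1
        return block


def runs(blocks):
    """Number of maximal runs of consecutive block numbers."""
    return 1 + sum(1 for x, y in zip(blocks, blocks[1:]) if y != x + 1)


def interleaved(n):
    """Two files allocating alternately with no reservation."""
    disk, a, b = Disk(), [], []
    for _ in range(n):
        a.append(disk.take(1))
        b.append(disk.take(1))
    return a, b


def reserved(n):
    """Two files allocating alternately, each from its own window."""
    disk = Disk()
    ra, rb = Reservation(disk), Reservation(disk)
    a, b = [], []
    for _ in range(n):
        a.append(ra.alloc())
        b.append(rb.alloc())
    return a, b
```

With two files each requesting 8 blocks alternately, `interleaved(8)` gives every file 8 separate one-block runs, while `reserved(8)` gives every file a single contiguous run.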
The block reservation state of each file is kept in the ext3 inode structure corresponding to the file. This structure is initialized when data blocks are first allocated for the file and destroyed when the last writing process closes the file.
With this access pattern, writing repeatedly to a file directly in an ext3 file system requires only one open and one close operation; the flow is shown in Fig. 1. Throughout the access, the file system keeps managing the preallocation state information and preallocates data blocks according to it. In a distributed file system, however, the file that the client accesses is not necessarily stored on a local disk; it corresponds to a target file in the local file system of the storage server. When the storage server is built on an ext3 local file system, the target file is a local file of type ext3.
In a distributed file system, a service thread on the storage server must open and close the target file for every write request that a client sends; the flow is shown in Fig. 2. When the target file is closed, its block preallocation state information is likely to be destroyed, so the state cannot be carried over from one request to the next. When several client processes allocate data blocks for their own files at the same time, the storage server cannot keep block preallocation state for each target file, and the block preallocation mechanism has no effect. The data of several target files is then stored interleaved on disk, which degrades the performance of writes and of subsequent reads.
To overcome these defects, the prior art adopts an object storage technique. With this technique, a file of the client, or part of a file, corresponds to a data object on the object storage server. When handling a client write request, the storage server does not allocate data blocks and write the data immediately. Instead, it first caches in memory the data to be written to the same data object, and only when this data has accumulated to a certain length does it allocate one large contiguous region on disk for the data object and then perform the write. This delayed allocation improves the on-disk contiguity of the data blocks of a data object.
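The delayed allocation idea can be sketched as follows. This is a simplified model rather than any particular object storage implementation; the class name, the threshold of 4 blocks, and the extent bookkeeping are assumptions made for illustration.

```python
class DelayedAllocator:
    """Buffers writes per data object and allocates one extent per flush."""
    def __init__(self, threshold_blocks=4):
        self.threshold = threshold_blocks
        self.pending = {}      # object id -> number of buffered blocks
        self.extents = {}      # object id -> list of (start, length)
        self.next_free = 0     # next free block on the toy disk

    def write(self, obj, blocks=1):
        # Data is only buffered here; nothing is allocated yet.
        self.pending[obj] = self.pending.get(obj, 0) + blocks
        if self.pending[obj] >= self.threshold:
            self.flush(obj)

    def flush(self, obj):
        # Allocate one contiguous extent for everything buffered so far.
        length = self.pending.pop(obj, 0)
        if length:
            self.extents.setdefault(obj, []).append((self.next_free, length))
            self.next_free += length


alloc = DelayedAllocator(threshold_blocks=4)
for _ in range(4):             # objects A and B are written alternately
    alloc.write("A")
    alloc.write("B")
```

Although the writes alternate, each object ends up with a single 4-block extent. Lowering `threshold_blocks`, which is roughly what memory pressure forces, produces shorter extents.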
However, because the total memory of the system is limited, if the number of data objects waiting for block allocation reaches a certain level, blocks must be allocated more frequently, and the length of each contiguous run allocated to a data object shrinks. Under a heavy load, therefore, the delayed allocation strategy does not work well. In addition, the storage server in this prior art requires dedicated hardware support, so it places higher demands on system equipment and is less practical.
Summary of the invention
The object of the present invention is to provide a file writing system and method for a distributed file system that improve the write performance of the distributed file system and solve the problem that, when several client processes allocate data blocks for their own files at the same time, the data blocks that the storage server allocates to the target files are interleaved and the data blocks of each file are distributed discontinuously on disk.
To achieve this object, the invention discloses a file writing method for a distributed file system, comprising the following steps:
Step 100: the distributed file system loads the storage server module and performs initialization;
Step 200: in response to a write request for a target file sent by a client in the distributed file system, the storage server uses the block preallocation descriptor cached for the target file to complete the write operation on the target file;
Step 300: a response message is encapsulated according to the result of the write operation and sent to the client from which the write request came.
Preferably, step 200 comprises the following steps:
Step 210: a client in the distributed file system sends a write request for a target file to the storage server;
Step 220: the storage server obtains the relevant information of the target file from the information contained in the write request;
Step 230: according to the target file information in the client's write request, the storage server initializes a block preallocation descriptor for the target file and caches the descriptor in the memory of the storage server; the ext3 local file system on the storage server reserves the corresponding data blocks for the target file according to the descriptor;
Step 240: the storage server completes the write operation on the target file;
Step 250: the target file is closed.
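Steps 220 to 250 can be outlined in user space as below. This is only a sketch under stated assumptions: the real block preallocation descriptor lives in the kernel's ext3 code and cannot be manipulated from Python, so `descriptor_cache` holds a hypothetical placeholder dictionary keyed by i-number, and `handle_write` is an invented name.

```python
import os
import tempfile

# Hypothetical stand-in for the descriptor cache: i-number -> placeholder.
descriptor_cache = {}


def handle_write(path, offset, data):
    """Serve one write request: open, look up i-number, write, close."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)   # open target file
    try:
        ino = os.fstat(fd).st_ino                          # obtain i-number
        # Reuse the cached placeholder if present, else initialize one.
        desc = descriptor_cache.setdefault(ino, {"requests": 0})
        desc["requests"] += 1
        os.lseek(fd, offset, os.SEEK_SET)
        written = os.write(fd, data)                       # perform the write
    finally:
        os.close(fd)                                       # close target file
    return ino, written


path = os.path.join(tempfile.mkdtemp(), "target.dat")
ino1, n1 = handle_write(path, 0, b"abcd")
ino2, n2 = handle_write(path, 4, b"efgh")
with open(path, "rb") as f:
    content = f.read()
```

The point of the sketch is that the cache entry outlives each open and close pair: the second request finds the state that the first request left behind.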
Preferably, step 220 comprises the following steps:
Step 221: the file name of the target file is parsed from the information contained in the write request;
Step 222: the target file is opened and its i-number is obtained.
Preferably, in step 250, after the target file is closed, the block preallocation descriptor of the target file remains cached in the memory.
Preferably, in the above method, the cache of block preallocation descriptors is organized as a hash table.
The hash function of the hash table maps the i-number of the target file to an index into an array of hash buckets;
Each entry of the hash table records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the entry, the time the entry was last used, and the i-number of the target file.
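The five fields listed for a hash entry can be written down as a small record type. The sketch below is illustrative only: the field names, the bucket count of 64, and the modulo hash are assumptions, since the text states only that the hash function maps an i-number to a bucket index.

```python
import time
from dataclasses import dataclass, field

NUM_BUCKETS = 64   # hypothetical size of the bucket array


@dataclass
class HashEntry:
    descriptor: object   # stands in for the descriptor's address
    superblock: object   # stands in for the superblock's address
    i_number: int        # i-number of the target file
    ref_count: int = 0   # number of users of the descriptor
    last_used: float = field(default_factory=time.monotonic)


def bucket_index(i_number: int) -> int:
    """Map an i-number to an index into the bucket array (assumed modulo)."""
    return i_number % NUM_BUCKETS


entry = HashEntry(descriptor=None, superblock=None, i_number=130)
index = bucket_index(entry.i_number)
```

Keeping the superblock alongside the i-number matters because i-numbers are unique only within one file system.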
Preferably, the above method further comprises step 400: a reclaim operation is periodically performed on the block preallocation descriptors, releasing the data blocks reserved for the target file and freeing the storage space of the descriptor.
Preferably, in step 400, it is determined whether the reference count of the hash entry equals 0 and whether the interval between the entry's last use and the current time exceeds the maximum lifetime of a block preallocation descriptor; if both conditions hold, the data blocks reserved for the target file are released and the storage space of the descriptor is freed.
Preferably, the above method further comprises step 500: the storage server module is unloaded.
Preferably, in step 240, completing the write operation on the target file comprises the following steps:
Step 241: the inode of the target file is locked;
Step 242: the ext3-private inode of the target file in the ext3 local file system is located through the inode of the target file;
Step 243: it is determined whether the block preallocation descriptor pointer of the target file's ext3-private inode is NULL; if it is not NULL, the flow goes to step 246; if it is NULL, the flow goes to step 244;
Step 244: the key of the block preallocation descriptor in the hash table is generated from the i-number of the target file;
Step 245: the hash table is searched to determine whether a block preallocation descriptor has been cached for the target file; if one is found, it is assigned to the block preallocation descriptor pointer of the target file's ext3-private inode, the reference count of the hash node to which the descriptor belongs is incremented by 1, and the flow goes to step 246; if none is found, the flow goes directly to step 246;
Step 246: the system function generic_file_aio_write_nolock() is called to complete the write operation; if no cached block preallocation descriptor was found for the target file in step 245, the system automatically allocates one for the target file and assigns it to the block preallocation descriptor pointer in the target file's ext3-private inode structure;
Step 247: it is determined whether a block preallocation descriptor was found for the target file in step 245; if so, the reference count of the hash node to which the descriptor belongs is decremented by 1 and the flow goes to step 248; if no descriptor was found in step 245 and, as a result of the allocation and assignment in step 246, the block preallocation descriptor pointer in the target file's ext3-private inode structure is not NULL, the descriptor to which that pointer points is inserted into the cache and the flow goes to step 248;
Step 248: the block preallocation descriptor pointer of the target file's ext3-private inode is set to NULL, to prevent the system from releasing the descriptor;
Step 249: the inode of the target file is unlocked.
To achieve this object, the invention also discloses a distributed file system comprising a storage server and a client;
The storage server includes an ext3 local file system, a preallocation descriptor management module and a write request processing module;
The preallocation descriptor management module is configured to initialize a block preallocation descriptor for a target file when a client's write request for that target file is received, and to manage and reclaim the block preallocation descriptor;
The write request processing module is configured to obtain the relevant information of the target file from the write request sent by the client for that target file, and to allocate data blocks for the write request and perform the write operation according to the block preallocation descriptor of the target file managed by the preallocation descriptor management module;
The client is configured to send write requests to the storage server.
Preferably, the write request processing module is further configured, after the write operation is completed, to encapsulate a response message according to the result of the write operation and send it to the client from which the write request came;
The client is further configured to receive the response message sent by the write request processing module.
Preferably, the block preallocation descriptors are organized as a hash table;
The hash function of the hash table maps the i-number of the target file to an index into an array of hash buckets;
Each entry of the hash table records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the entry, the time the entry was last used, and the i-number of the target file.
Preferably, in the above system, the relevant information of the target file is the i-number of the target file.
Preferably, in the above system, the write request processing module is further configured to release the data blocks reserved for the target file when the reference count of the hash entry equals 0 and the interval between the entry's last use and the current time exceeds the maximum lifetime of a block preallocation descriptor.
Preferably, in the above system, the preallocation descriptor management module is further configured to free the storage space of the block preallocation descriptor when the reference count of the hash entry equals 0 and the interval between the entry's last use and the current time exceeds the maximum lifetime of a block preallocation descriptor.
Preferably, in the above system, the write request processing module is further configured, when performing the write operation on the target file, to carry out the following steps:
Step 001: the inode of the target file is locked;
Step 002: the ext3-private inode of the target file in the ext3 local file system is located through the inode of the target file;
Step 003: it is determined whether the block preallocation descriptor pointer of the target file's ext3-private inode is NULL; if it is not NULL, the flow goes to step 006; if it is NULL, the flow goes to step 004;
Step 004: the key of the block preallocation descriptor in the hash table is generated from the i-number of the target file;
Step 005: the hash table is searched to determine whether a block preallocation descriptor has been cached for the target file; if one is found, it is assigned to the block preallocation descriptor pointer of the target file's ext3-private inode, the reference count of the hash node to which the descriptor belongs is incremented by 1, and the flow goes to step 006; if none is found, the flow goes directly to step 006;
Step 006: the system function generic_file_aio_write_nolock() is called to complete the write operation; if no cached block preallocation descriptor was found for the target file in step 005, the system automatically allocates one for the target file and assigns it to the block preallocation descriptor pointer in the target file's ext3-private inode structure;
Step 007: it is determined whether a block preallocation descriptor was found for the target file in step 005; if so, the reference count of the hash node to which the descriptor belongs is decremented by 1 and the flow goes to step 008; if no descriptor was found in step 005 and, as a result of the allocation and assignment in step 006, the block preallocation descriptor pointer in the target file's ext3-private inode structure is not NULL, the descriptor to which that pointer points is inserted into the cache and the flow goes to step 008;
Step 008: the block preallocation descriptor pointer of the target file's ext3-private inode is set to NULL, to prevent the system from releasing the descriptor;
Step 009: the inode of the target file is unlocked.
The file writing system and method for a distributed file system of the present invention have the following beneficial effect: when several client loads need to allocate data blocks for their own files at the same time, the storage server can still exploit the block reservation allocation strategy, so that the data blocks of each target file are stored on disk as contiguously as possible. This contiguous layout improves performance when data is written back and also improves sequential read performance.
The invention is described below with reference to the drawings and specific embodiments, which are not to be taken as limiting the invention.
Description of drawings
Fig. 1 is a schematic diagram of a write operation in a prior-art local file system;
Fig. 2 is a schematic diagram of a write operation in a prior-art distributed file system;
Fig. 3 is a flowchart of the file writing method for a distributed file system according to the invention;
Fig. 4 is a diagram of the cache organization of the block preallocation descriptors according to the invention;
Fig. 5 is a structural diagram of the distributed file system according to the invention.
Embodiment
To make the objects, technical solutions and advantages of the invention clearer, the file writing system and method for a distributed file system are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
The file writing system and method of the invention apply to distributed file systems in which the local file system on the storage server is an ext3 file system; the distributed file system may be any existing distributed file system that meets this requirement. The invention mainly uses the block reservation allocation mechanism of the ext3 file system to improve the write performance of the distributed file system.
Fig. 3 is a flowchart of the writing method for a distributed file system according to the invention. The method comprises the following steps:
Step S100: the distributed file system loads the storage server module and performs initialization.
Step S200: a client in the distributed file system sends a write request for a target file to the storage server. The write request contains information such as the identifier of the target file to be accessed, the starting position of the write relative to the target file, and the length of the write.
Step S300: the storage server obtains the file name of the target file from the information contained in the write request, opens the target file, and obtains its i-number. The i-number is the inode number of the file, which uniquely identifies the file within the file system.
Step S400: after the ext3 local file system on the storage server has initialized a block preallocation descriptor for the target file, the storage server caches the descriptor in its memory and reserves the corresponding data blocks for the target file according to the descriptor. After the storage server closes the target file, the block preallocation descriptor of the target file remains cached in the memory.
The cache organization of the block preallocation descriptors is shown in Fig. 4. In this embodiment of the invention the descriptors are cached in a hash table, but this is not a limitation of the invention.
In this embodiment, the hash table that caches all block preallocation descriptors has a fixed number of hash buckets, the maximum number of hash entries that each bucket can hold is also fixed, and the entries within each bucket are linked in an LRU list. The fixed sizes and the LRU list are only examples and do not limit the invention; in practice, non-fixed sizes and other linked structures may also be used. The hash function of the hash table maps the i-number of the target file to an index into the array of hash buckets. Each entry of the hash table records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the entry, the time the entry was last used, and the i-number of the target file.
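A fixed number of buckets, each holding a bounded LRU list, can be modeled with one ordered dictionary per bucket. The sketch below is an assumption-laden illustration: the class name, the sizes, the modulo hash, and in particular the policy of evicting the least recently used entry when a bucket is full are not specified by the text, and a real implementation would also have to respect reference counts before evicting.

```python
from collections import OrderedDict


class DescriptorCache:
    """Fixed bucket array; each bucket is a bounded LRU list of entries."""
    def __init__(self, num_buckets=4, max_per_bucket=2):
        self.buckets = [OrderedDict() for _ in range(num_buckets)]
        self.max_per_bucket = max_per_bucket

    def _bucket(self, i_number):
        return self.buckets[i_number % len(self.buckets)]

    def lookup(self, i_number):
        bucket = self._bucket(i_number)
        if i_number not in bucket:
            return None
        bucket.move_to_end(i_number)          # mark as most recently used
        return bucket[i_number]

    def insert(self, i_number, descriptor):
        bucket = self._bucket(i_number)
        evicted = None
        if i_number not in bucket and len(bucket) >= self.max_per_bucket:
            evicted = bucket.popitem(last=False)   # least recently used
        bucket[i_number] = descriptor
        bucket.move_to_end(i_number)
        return evicted


cache = DescriptorCache(num_buckets=4, max_per_bucket=2)
cache.insert(1, "d1")
cache.insert(5, "d5")            # 1, 5 and 9 all map to bucket 1
hit = cache.lookup(1)            # 1 becomes most recently used
evicted = cache.insert(9, "d9")  # bucket is full, so 5 is evicted
```

Bounding each bucket keeps lookup cost and memory use predictable, which is one plausible reason for the fixed sizes chosen in the embodiment.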
Step S500: the storage server retrieves from memory the block preallocation descriptor cached for the target file and allocates data blocks for the target file according to it, thereby completing the write operation on the target file.
Step S600: a response message is encapsulated according to the result of the write operation and sent to the client from which the write request came.
Step S700: a reclaim operation on the block preallocation descriptors is performed periodically. During reclaim, a reclaim thread traverses every hash entry and determines whether the entry's reference count equals 0 and whether the interval between its last use and the current time exceeds the maximum lifetime MAX_LIFE_TIME of a block preallocation descriptor. If both conditions hold, all data blocks reserved for the target file are released and the storage space of the hash entry is freed. The reference count indicates the number of processes currently using the block preallocation descriptor.
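The two-part reclaim condition can be expressed directly. In the sketch below the value of `MAX_LIFE_TIME`, the dictionary layout of an entry, and the `release_blocks` callback are hypothetical; only the condition itself (reference count equal to 0 and idle time greater than the maximum lifetime) comes from the text.

```python
MAX_LIFE_TIME = 30.0   # seconds; hypothetical value


def reclaim(entries, now, release_blocks):
    """Reclaim idle, unreferenced entries; return the reclaimed i-numbers.

    entries maps i-number -> {"ref_count": int, "last_used": float}.
    """
    reclaimed = []
    for ino, entry in list(entries.items()):   # copy: we delete while looping
        idle = now - entry["last_used"]
        if entry["ref_count"] == 0 and idle > MAX_LIFE_TIME:
            release_blocks(ino)                # free the reserved data blocks
            del entries[ino]                   # free the entry itself
            reclaimed.append(ino)
    return reclaimed


entries = {
    1: {"ref_count": 0, "last_used": 0.0},     # idle and unreferenced
    2: {"ref_count": 1, "last_used": 0.0},     # still in use
    3: {"ref_count": 0, "last_used": 90.0},    # used recently
}
released = []
result = reclaim(entries, now=100.0, release_blocks=released.append)
```

Only entry 1 satisfies both conditions: entry 2 is still referenced, and entry 3 has been idle for only 10 seconds.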
Step S800: the distributed file system unloads the storage server module.
Step S500 further comprises the following steps:
Step S501: the inode of the target file is locked.
Step S502: the ext3-private inode of the target file in the ext3 local file system is located through the inode of the target file.
Step S503: it is determined whether the block preallocation descriptor pointer in the target file's ext3-private inode structure is NULL; if it is not NULL, the flow goes to step S506; if it is NULL, the flow goes to step S504.
Step S504: the key of the block preallocation descriptor in the hash table is generated from the i-number of the target file.
Step S505: the hash table is searched to determine whether a block preallocation descriptor has been cached for the target file. If one is found, it is assigned to the block preallocation descriptor pointer in the target file's ext3-private inode structure, the reference count of the hash node to which the descriptor belongs is incremented by 1, and the flow goes to step S506. If none is found, the flow goes directly to step S506.
Step S506: the system function generic_file_aio_write_nolock() is called to complete the write operation. During the write, if no cached block preallocation descriptor was found for the target file in step S505, the system automatically allocates one for the target file and assigns it to the block preallocation descriptor pointer in the target file's ext3-private inode structure.
Step S507: it is determined whether a block preallocation descriptor was found for the target file in step S505; if so, the reference count of the hash node to which the descriptor belongs is decremented by 1 and the flow goes to step S508. If no descriptor was found in step S505 and, as a result of the allocation and assignment in step S506, the block preallocation descriptor pointer in the target file's ext3-private inode structure is not NULL, the descriptor to which that pointer points is inserted into the cache and the flow goes to step S508.
Step S508: the block preallocation descriptor pointer in the target file's ext3-private inode structure is set to NULL, to prevent the system from releasing the descriptor.
Step S509: the inode of the target file is unlocked.
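The control flow of steps S501 to S509 can be traced with a small user-space model. None of this is kernel code: `Inode`, `Entry`, `cache`, and `local_write` are hypothetical stand-ins, and `local_write` merely imitates the described behavior of generic_file_aio_write_nolock(), namely that a descriptor is allocated during the write if none is attached.

```python
import threading


class Inode:
    """Stand-in for the inode together with its ext3-private part."""
    def __init__(self, i_number):
        self.i_number = i_number
        self.lock = threading.Lock()
        self.alloc_info = None      # block preallocation descriptor pointer


class Entry:
    """Stand-in for a hash node in the descriptor cache."""
    def __init__(self, descriptor):
        self.descriptor = descriptor
        self.ref_count = 0


cache = {}   # i-number -> Entry (simplified: a dict instead of buckets)


def local_write(inode, data):
    """Imitates the local write: allocates a descriptor if none is attached."""
    if inode.alloc_info is None:
        inode.alloc_info = {"owner": inode.i_number}
    return len(data)


def write_target(inode, data):
    with inode.lock:                                   # S501 and S509
        entry = None
        if inode.alloc_info is None:                   # S503
            entry = cache.get(inode.i_number)          # S504, S505
            if entry is not None:
                inode.alloc_info = entry.descriptor
                entry.ref_count += 1
        written = local_write(inode, data)             # S506
        if entry is not None:                          # S507: cache hit
            entry.ref_count -= 1
        elif inode.alloc_info is not None:             # S507: cache miss
            cache[inode.i_number] = Entry(inode.alloc_info)
        inode.alloc_info = None                        # S508
    return written


inode = Inode(42)
first_len = write_target(inode, b"abc")     # miss: descriptor gets cached
first_desc = cache[42].descriptor
second_len = write_target(inode, b"de")     # hit: cached descriptor reused
```

Detaching the pointer in S508 is what lets the descriptor survive the later close of the target file: the cache, not the inode, then holds the only reference to it.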
Figure 5 is a structural diagram of a distributed file system provided by the present invention. As shown in Figure 5, the distributed file system 1 of the present invention comprises a storage server 10 and a client 20, and the local file system of the storage server 10 is an ext3 local file system 100. The storage server 10 includes a preallocation descriptor management module 110 and a write request processing module 120.
The preallocation descriptor management module 110 initializes, manages and reclaims preallocation descriptors.
The write request processing module 120 handles the write requests sent by the client 20 of the distributed file system 1, and performs operations such as parsing the request, writing the data and sending the response message.
When the distributed file system 1 is running, the client 20 sends a write request for a target file to the storage server 10. The write request processing module 120 obtains the file name of the target file from the information contained in the write request, opens the target file, and obtains the inode number of the target file. The preallocation descriptor management module 110 initializes a preallocation descriptor for the target file and caches it in the memory of the storage server 10. The ext3 local file system 100 reserves the corresponding data blocks for the target file according to the preallocation descriptor. After the storage server 10 closes the write process for the target file, the preallocation descriptor of the target file remains cached in memory.
Figure 4 shows how the cache of preallocation descriptors is organized. In this embodiment of the present invention a hash table is used to cache the preallocation descriptors, but this does not limit the invention. In this embodiment, the hash table that caches all block preallocation descriptors has a fixed number of hash buckets, the maximum number of hash entries that each bucket can hold is also fixed, and the hash entries within each bucket are linked by an LRU list. The fixed sizes and the LRU list are examples only and do not limit the invention; in practice, variable sizes and other linked structures may also be used. The hash function of the hash table maps the inode number of the target file to an index into the hash bucket array. Each hash entry records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the hash entry, the time the hash entry was last used, and the inode number of the target file.
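The cache organization described above can be modeled roughly as follows. This is a sketch under stated assumptions, not the patented implementation: the constants `NUM_BUCKETS` and `MAX_PER_BUCKET`, the modulo hash function, the class and method names, and the eviction policy when a bucket is full (drop the least recently used entry whose reference count is 0) are all assumptions made for illustration; the text fixes only the entry fields and the use of an LRU list per bucket.

```python
import time
from collections import OrderedDict
from dataclasses import dataclass, field

NUM_BUCKETS = 8       # assumed fixed number of hash buckets
MAX_PER_BUCKET = 2    # assumed fixed maximum number of entries per bucket


@dataclass
class HashEntry:
    desc: object      # "address" of the block preallocation descriptor
    sb: object        # "address" of the superblock of the owning file system
    ino: int          # inode number of the target file
    refcount: int = 0
    last_used: float = field(default_factory=time.monotonic)


def bucket_index(ino):
    """Assumed hash function: inode number -> index into the bucket array."""
    return ino % NUM_BUCKETS


class DescriptorCache:
    def __init__(self):
        # Each bucket is an OrderedDict used as an LRU list (oldest first).
        self.buckets = [OrderedDict() for _ in range(NUM_BUCKETS)]

    def lookup(self, sb, ino):
        bucket = self.buckets[bucket_index(ino)]
        entry = bucket.get((sb, ino))
        if entry is not None:
            entry.last_used = time.monotonic()
            bucket.move_to_end((sb, ino))   # mark as most recently used
        return entry

    def insert(self, sb, ino, desc):
        bucket = self.buckets[bucket_index(ino)]
        if (sb, ino) in bucket:
            return bucket[(sb, ino)]
        if len(bucket) >= MAX_PER_BUCKET:
            # Assumed policy: evict the least recently used idle entry.
            victim = next((k for k, e in bucket.items() if e.refcount == 0), None)
            if victim is None:
                return None                 # every entry is in use; do not cache
            del bucket[victim]
        entry = HashEntry(desc=desc, sb=sb, ino=ino)
        bucket[(sb, ino)] = entry
        return entry
```

Keying each bucket by the pair (superblock, inode number) reflects the fact that inode numbers are only unique within one file system, which is presumably why each entry records the superblock address.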
After the write operation is completed, the write request processing module 120 encapsulates a response message according to the result of the write operation and sends it to the client that issued the write request.
Finally, because data blocks reserved for a target file cannot be allocated to any other file, preallocation descriptors that are no longer in use must be released periodically so that the data blocks reserved for the target file are freed. The preallocation descriptor management module 110 is therefore also used to periodically reclaim block preallocation descriptors. To perform a reclaim operation, the preallocation descriptor management module 110 starts a reclaim thread, which traverses every hash entry and checks whether the reference count of the entry equals 0 and whether the interval between its last use time and the current time exceeds the maximum lifetime MAX_LIFE_TIME of a block preallocation descriptor. If both conditions are satisfied, the preallocation descriptor management module 110 releases all data blocks reserved for the target file and frees the storage space of the hash entry. The value of the reference count indicates the number of processes currently using the preallocation descriptor.
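The reclaim rule can be written as a single pass over the hash entries. The sketch below is a simplified model: the value of `MAX_LIFE_TIME`, the `reserved_blocks` field and the `free_blocks` list are assumptions added for illustration, and a real reclaim thread would also need locking against concurrent writers.

```python
from dataclasses import dataclass

MAX_LIFE_TIME = 30.0  # assumed maximum lifetime of a descriptor, in seconds


@dataclass
class HashEntry:
    ino: int                # inode number of the target file
    refcount: int           # number of processes currently using the descriptor
    last_used: float        # time of last use
    reserved_blocks: list   # hypothetical list of block numbers held in reserve


def reclaim(entries, now, free_blocks):
    """Return the entries that survive one reclaim pass.

    An entry is reclaimed only if it is idle (refcount == 0) AND it has not
    been used for longer than MAX_LIFE_TIME. Its reserved blocks go back to
    free_blocks and the entry itself is dropped.
    """
    survivors = []
    for entry in entries:
        expired = now - entry.last_used > MAX_LIFE_TIME
        if entry.refcount == 0 and expired:
            free_blocks.extend(entry.reserved_blocks)
        else:
            survivors.append(entry)
    return survivors
```

Requiring both conditions means that a descriptor still referenced by an in-flight write is never reclaimed, however old it is, and a recently used idle descriptor is kept so that the next write to the same file can reuse its reservation.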
While the above system is running, the storage server retrieves from the cache the block preallocation descriptor cached for the target file and allocates data blocks for the target file according to it, thereby completing the write operation on the target file. The operations performed are the same as those described above for the file writing method of the distributed file system and are therefore not repeated here.
With the file writing system and method of the distributed file system of the present invention, when the storage server needs to allocate data blocks for a target file, it can allocate them directly from the preallocation information in the cache. Because the preallocation state of every write request is kept in the cache, preallocation information is carried over from one request to the next. When multiple client processes allocate data blocks for their own files at the same time, the storage server can exploit the benefit of the block preallocation mechanism, so that the data of each target file is stored as contiguously as possible on disk and interleaving is avoided.
Of course, the present invention may have various other embodiments. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art may make corresponding changes and variations according to the present invention, but all such changes and variations shall fall within the scope of protection of the appended claims.
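The effect on disk layout can be illustrated with a toy allocator. This is a deliberately simplified model and not how ext3 allocates blocks: the function `simulate`, the window size, and the rule that an unused reservation is handed back to the free pool after every request are assumptions chosen to show the difference between discarding and keeping the preallocation state across requests.

```python
def simulate(requests, window, keep):
    """Allocate one block per request and return {file: [block numbers]}.

    requests: sequence of file names, one per incoming write request
    window:   number of blocks reserved when a file has no reservation
    keep:     True  -> the reservation survives between requests (cached)
              False -> the reservation is dropped after every request
    """
    next_free = 0    # first block not yet handed out or reserved
    reserved = {}    # file -> (next block, end) of its reservation window
    layout = {}
    for name in requests:
        win = reserved.get(name)
        if win is None or win[0] == win[1]:
            # No usable reservation: reserve a fresh window of blocks.
            win = (next_free, next_free + window)
            next_free += window
        block = win[0]
        layout.setdefault(name, []).append(block)
        if keep:
            reserved[name] = (block + 1, win[1])
        else:
            # The unused part of the window goes back to the free pool.
            next_free = block + 1
    return layout


# Two clients writing alternately, three requests each.
requests = ["A", "B", "A", "B", "A", "B"]
interleaved = simulate(requests, window=4, keep=False)
contiguous = simulate(requests, window=4, keep=True)
```

In this model, dropping the reservation after each request gives file A blocks 0, 2, 4 and file B blocks 1, 3, 5, whereas keeping it gives A blocks 0, 1, 2 and B blocks 4, 5, 6.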
Claims (16)
1. A file writing method for a distributed file system, characterized in that it comprises the following steps:
Step 100: the distributed file system loads the storage server module and performs initialization;
Step 200: in response to a write request for a target file sent by a client in the distributed file system, the storage server uses the block preallocation descriptor cached for the target file to complete the write operation on the target file;
Step 300: a response message is encapsulated according to the result of the write operation and sent to the client from which the write request originated.
2. The file writing method for a distributed file system according to claim 1, characterized in that step 200 comprises the following steps:
Step 210: a client in the distributed file system sends a write request for a target file to the storage server;
Step 220: the storage server obtains the relevant information of the target file according to the information contained in the write request;
Step 230: according to the target file information in the client's write request, the storage server initializes a block preallocation descriptor for the target file and caches the block preallocation descriptor in the memory of the storage server, and the ext3 local file system in the storage server reserves the corresponding data blocks for the target file according to the block preallocation descriptor;
Step 240: the storage server completes the write operation on the target file;
Step 250: the target file is closed.
3. The file writing method for a distributed file system according to claim 2, characterized in that step 220 comprises the following steps:
Step 221: the file name of the target file is parsed from the information contained in the write request;
Step 222: the target file is opened and the inode number of the target file is obtained.
4. The file writing method for a distributed file system according to claim 2, characterized in that, in step 250, after the target file is closed, the block preallocation descriptor of the target file remains cached in the memory.
5. The file writing method for a distributed file system according to claim 2, characterized in that the cache of block preallocation descriptors is organized as a hash table;
the hash function of the hash table maps the inode number of the target file to an index into the hash bucket array;
each hash entry of the hash table records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the hash entry, the time the hash entry was last used, and the inode number of the target file.
6. The file writing method for a distributed file system according to claim 1, characterized in that it further comprises step 400: periodically performing a reclaim operation on the block preallocation descriptor, releasing the data blocks reserved for the target file and freeing the storage space of the block preallocation descriptor.
7. The file writing method for a distributed file system according to claim 6, characterized in that, in step 400, it is determined whether the reference count of the hash entry equals 0 and whether the interval between its last use time and the current time exceeds the maximum lifetime of a block preallocation descriptor; if both conditions are satisfied, the data blocks reserved for the target file are released and the storage space of the block preallocation descriptor is freed.
8. The file writing method for a distributed file system according to claim 1, characterized in that it further comprises step 500: unloading the storage server module.
9. The file writing method for a distributed file system according to claim 2, characterized in that, in step 240, completing the write operation on the target file comprises the following steps:
Step 241: lock the inode of the target file;
Step 242: through the inode of the target file, find the private inode of the target file in the ext3 local file system;
Step 243: determine whether the block preallocation descriptor pointer of the ext3-private inode of the target file is NULL; if it is not NULL, proceed to step 246; if it is NULL, proceed to step 244;
Step 244: generate the key of the block preallocation descriptor in the hash table from the inode number of the target file;
Step 245: search the hash table to determine whether a block preallocation descriptor has been cached for the target file; if one is found, assign it to the block preallocation descriptor pointer of the ext3-private inode of the target file, increment the reference count of the hash node to which the descriptor belongs, and proceed to step 246; if none is found, proceed directly to step 246;
Step 246: call the system function generic_file_aio_write_nolock() to complete the write operation; if no cached block preallocation descriptor was found for the target file in step 245, the system automatically allocates one for the target file and assigns it to the block preallocation descriptor pointer in the ext3-private inode structure of the target file;
Step 247: determine whether a block preallocation descriptor was found for the target file in step 245; if so, decrement the reference count of the hash node to which the descriptor belongs and proceed to step 248; if no descriptor was found in step 245 but, as a result of the allocation and assignment in step 246, the block preallocation descriptor pointer in the ext3-private inode structure of the target file is not NULL, insert the descriptor it points to into the cache and proceed to step 248;
Step 248: set the block preallocation descriptor pointer of the ext3-private inode of the target file to NULL, to prevent the system from releasing the descriptor;
Step 249: unlock the inode of the target file.
10. A distributed file system, characterized in that it comprises a storage server and a client;
the storage server includes an ext3 local file system, a preallocation descriptor management module and a write request processing module;
the preallocation descriptor management module is configured to initialize a block preallocation descriptor for a target file when a write request for the target file is received from the client, and to manage and reclaim the block preallocation descriptor;
the write request processing module is configured to obtain the relevant information of the target file according to the write request for the target file sent by the client, and, according to the block preallocation descriptor of the target file managed by the preallocation descriptor management module, to allocate data blocks for the write request and perform the write operation;
the client is configured to send write requests to the storage server.
11. The distributed file system according to claim 10, characterized in that the write request processing module is further configured, after the write operation is completed, to encapsulate a response message according to the result of the write operation and to send the response message to the client from which the write request originated;
the client is further configured to receive the response message sent by the write request processing module.
12. The distributed file system according to claim 10, characterized in that the block preallocation descriptors are organized as a hash table;
the hash function of the hash table maps the inode number of the target file to an index into the hash bucket array;
each hash entry of the hash table records the address of the block preallocation descriptor of the target file, the address of the superblock of the file system to which the target file belongs, the reference count of the hash entry, the time the hash entry was last used, and the inode number of the target file.
13. The distributed file system according to claim 10, characterized in that the relevant information of the target file is the inode number of the target file.
14. The distributed file system according to claim 12, characterized in that the write request processing module is further configured to release the data blocks reserved for the target file when the reference count of the hash entry equals 0 and the interval between its last use time and the current time exceeds the maximum lifetime of a block preallocation descriptor.
15. The distributed file system according to claim 12, characterized in that the preallocation descriptor management module is further configured to free the storage space of the block preallocation descriptor when the reference count of the hash entry equals 0 and the interval between its last use time and the current time exceeds the maximum lifetime of a block preallocation descriptor.
16. The distributed file system according to claim 10, characterized in that the write request processing module is further configured, when performing the write operation on the target file, to carry out the following steps:
Step 001: lock the inode of the target file;
Step 002: through the inode of the target file, find the private inode of the target file in the ext3 local file system;
Step 003: determine whether the block preallocation descriptor pointer of the ext3-private inode of the target file is NULL; if it is not NULL, proceed to step 006; if it is NULL, proceed to step 004;
Step 004: generate the key of the block preallocation descriptor in the hash table from the inode number of the target file;
Step 005: search the hash table to determine whether a block preallocation descriptor has been cached for the target file; if one is found, assign it to the block preallocation descriptor pointer of the ext3-private inode of the target file, increment the reference count of the hash node to which the descriptor belongs, and proceed to step 006; if none is found, proceed directly to step 006;
Step 006: call the system function generic_file_aio_write_nolock() to complete the write operation; if no cached block preallocation descriptor was found for the target file in step 005, the system automatically allocates one for the target file and assigns it to the block preallocation descriptor pointer in the ext3-private inode structure of the target file;
Step 007: determine whether a block preallocation descriptor was found for the target file in step 005; if so, decrement the reference count of the hash node to which the descriptor belongs and proceed to step 008; if no descriptor was found in step 005 but, as a result of the allocation and assignment in step 006, the block preallocation descriptor pointer in the ext3-private inode structure of the target file is not NULL, insert the descriptor it points to into the cache and proceed to step 008;
Step 008: set the block preallocation descriptor pointer of the ext3-private inode of the target file to NULL, to prevent the system from releasing the descriptor;
Step 009: unlock the inode of the target file.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2007101679005A CN100517335C (en) | 2007-10-25 | 2007-10-25 | Distributed file system file writing system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101149755A (en) | 2008-03-26 |
CN100517335C (en) | 2009-07-22 |
Family
ID=39250281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2007101679005A Expired - Fee Related CN100517335C (en) | 2007-10-25 | 2007-10-25 | Distributed file system file writing system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100517335C (en) |
Also Published As
Publication number | Publication date |
---|---|
CN100517335C (en) | 2009-07-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20090722; Termination date: 20191025 |