CN103605726B - Access method and system for small documents, and control node and memory node

Info

Publication number
CN103605726B
Authority
CN
China
Prior art keywords
small documents
files
blocks
memory node
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310575327.7A
Other languages
Chinese (zh)
Other versions
CN103605726A (en)
Inventor
徐君
倪涛
郭家栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhicheng Jianke Design Co., Ltd
Original Assignee
China Security and Fire Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Security and Fire Technology Co Ltd
Priority to CN201310575327.7A
Publication of CN103605726A
Application granted
Publication of CN103605726B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention applies to the field of cloud storage technology and provides an access method and system for small documents, together with a control node and a memory node. In the method, in response to a create small document command sent by a client, the control node creates a new aggregate file or selects an existing aggregate file from an aggregate file list, first generates the sequence number of the small document according to the attribute information of the aggregate file, and then generates the identifier of the small document and the information of the memory node used to store it. According to that memory node information, the client sends a write small document instruction to the corresponding memory node. According to the instruction, the memory node writes the data of the small document into the corresponding file block, and writes the start position of the data in the file block and the length of the small document into the index region of the file block. The invention can subsequently support functions such as modifying, deleting, locking, and unlocking small documents, and can improve performance when retrieving massive numbers of small documents.

Description

Access method and system for small documents, and control node and memory node
Technical field
The invention belongs to the field of cloud storage technology, and in particular relates to an access method and system for small documents, and to a control node and a memory node.
Background
With the broad development and application of cloud computing and cloud storage technology, how to store massive numbers (billions) of small documents (pictures, documents, and the like of tens of KB to several MB) has become a difficult problem for distributed file system applications.
Existing distributed file systems, such as the Hadoop Distributed File System (HDFS), have a serious defect when storing massive numbers of small documents. Because of the memory limit of the HDFS namenode (master node), 1 million files typically require at least 3 GB of memory, so HDFS cannot store and manage files at the billion scale. HDFS also provides the HAR file and Sequence file formats to alleviate the problem, but both work by converting already stored files into the respective format through specific commands. Because of their storage structure, they cannot support modifying, deleting, locking, or unlocking small documents, and performance when retrieving massive numbers of small documents also needs improvement.
Summary of the invention
The embodiments of the present invention provide an access method and system for small documents, and a control node and a memory node, with the aim of solving the problem that the file access methods of the prior art cannot support functions such as modifying, deleting, locking, and unlocking small documents, and that their performance when retrieving massive numbers of small documents needs improvement.
In one aspect, an access method for small documents is provided, the method including:
The control node receives a create small document command sent by a client;
According to the create small document command, the control node creates a new aggregate file or selects an existing aggregate file from an aggregate file list, and generates the sequence number of the small document in the aggregate file according to the attribute information of the aggregate file;
According to the attribute information of the aggregate file and the sequence number of the small document in the aggregate file, the control node generates the identifier of the small document and the information of the memory node used to store the small document, and sends them to the client. According to the information of the memory node used to store the small document, the client sends a write small document instruction to the corresponding memory node. According to the write small document instruction, the memory node writes the data of the small document into the corresponding file block, and writes the start position of the data of the small document in the file block and the length of the small document into the index region of the file block.
Further, the attribute information of the aggregate file includes the identifier of the aggregate file and the block list of the aggregate file. The block list includes the sequence number of each file block in the aggregate file, the number of small documents stored in each file block, and the information of the memory node storing each file block.
Further, the step in which the control node generates the identifier of the small document and the information of the memory node used to store the small document, according to the attribute information of the aggregate file and the sequence number of the small document in the aggregate file, includes:
The control node generates the identifier of the small document according to the identifier of the aggregate file and the sequence number of the small document in the aggregate file;
The control node calculates the sequence number of the file block used to store the small document according to the sequence number of the small document in the aggregate file and the number of small documents stored in each file block;
The control node obtains the information of the memory node used to store the small document according to the sequence number of the file block used to store the small document and the information of the memory node storing each file block.
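The block lookup described in the last two steps can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: it assumes sequence numbers start at 1 and that the per-block count can be accumulated block by block to find the block covering a given sequence number. The names `BlockEntry` and `locate_block` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockEntry:
    """One row of the aggregate file's block list (hypothetical layout)."""
    block_seq: int        # sequence number of the file block in the aggregate file
    small_doc_count: int  # number of small documents held by this block
    memory_node: str      # information (here an address string) of the memory node storing the block


def locate_block(small_doc_seq: int, blocks: List[BlockEntry]) -> BlockEntry:
    """Return the block entry whose cumulative count covers the 1-based small_doc_seq."""
    if small_doc_seq < 1:
        raise ValueError("sequence numbers are assumed to start at 1")
    covered = 0
    for entry in blocks:
        covered += entry.small_doc_count
        if small_doc_seq <= covered:
            return entry
    raise LookupError(f"sequence number {small_doc_seq} is not covered by any block")


blocks = [
    BlockEntry(1, 3, "node1"),
    BlockEntry(2, 3, "node1"),
    BlockEntry(3, 3, "node2"),
]
target = locate_block(7, blocks)
print(target.block_seq, target.memory_node)
```

Under these assumptions the control node and the memory node can run the same calculation independently, which is consistent with the text having both nodes derive the block sequence number from the same two inputs.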
Further, the step in which the memory node, according to the write small document instruction, writes the data of the small document into the corresponding file block and writes the start position of the data of the small document in the file block and the length of the small document into the index region of the file block includes:
The memory node extracts the identifier of the small document from the write small document instruction;
The memory node obtains, from the identifier of the small document, the identifier of the aggregate file that stores the small document and the sequence number of the small document in the aggregate file;
The memory node calculates the sequence number of the file block used to store the small document according to the sequence number of the small document in the aggregate file and the number of small documents stored in each file block;
The memory node writes the data of the small document into the file block, and writes the start position of the data of the small document in the file block and the length of the small document into the index region of the file block.
Further, after the control node generates the identifier of the small document and the information of the memory node used to store it and sends them to the client, and the memory node, on the write small document instruction sent by the client, writes the data of the small document into the corresponding file block and writes its start position and length into the index region of the file block, the method further includes:
The control node receives a modify small document instruction sent by the client;
The control node obtains, according to the modify small document instruction, the information of the memory node storing the small document and sends it to the client. According to that information, the client sends a modify small document instruction to the corresponding memory node. According to the instruction, the memory node inserts the modification data of the small document at the end of the file block that stores the small document and, in the index region of the file block, updates the start position of the data of the small document in the file block to the start storage position of the modification data and updates the length of the small document to the length of the modification data.
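The write and modify behaviour described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the patent's on-disk format: the index region is modelled as a dictionary keyed by small document sequence number, the data region as a byte buffer, and the class name `FileBlock` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FileBlock:
    """In-memory stand-in for a file block with an index region and a data region."""
    data: bytearray = field(default_factory=bytearray)
    # index region: small document sequence number -> (start position, length)
    index: Dict[int, Tuple[int, int]] = field(default_factory=dict)

    def write(self, seq: int, payload: bytes) -> None:
        """Append the payload and record its start position and length in the index."""
        start = len(self.data)
        self.data += payload
        self.index[seq] = (start, len(payload))

    def modify(self, seq: int, payload: bytes) -> None:
        """Append the modification data at the end of the block and repoint the index entry."""
        if seq not in self.index:
            raise KeyError(f"small document {seq} is not stored in this block")
        self.write(seq, payload)  # the old bytes stay in place and become invalid data


block = FileBlock()
block.write(1, b"aaaa")
block.write(2, b"bb")
block.modify(1, b"ccc")
print(block.index[1], bytes(block.data))
```

Because a modification never rewrites bytes in place, the old version of the small document remains in the data region as invalid data until the block is later consolidated.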
Further, after the control node generates the identifier of the small document and the information of the memory node used to store it and sends them to the client, and the memory node, on the write small document instruction sent by the client, writes the data of the small document into the corresponding file block and writes its start position and length into the index region of the file block, the method further includes:
The control node receives a read small document instruction sent by the client;
The control node obtains, according to the read small document instruction, the information of the memory node storing the small document and sends it to the client. According to that information, the client sends a read small document instruction to the corresponding memory node. According to the instruction, the memory node first finds the index of the small document stored in the file block, then reads the start position and length of the small document stored in the index, and finally reads the corresponding data according to the start position and length.
Further, after the control node generates the identifier of the small document and the information of the memory node used to store it and sends them to the client, and the memory node, on the write small document instruction sent by the client, writes the data of the small document into the corresponding file block and writes its start position and length into the index region of the file block, the method further includes:
The control node receives a delete small document instruction sent by the client;
The control node obtains, according to the delete small document instruction, the information of the memory node storing the small document and sends it to the client. According to that information, the client sends a delete small document instruction to the corresponding memory node. According to the instruction, the memory node first finds the index of the small document stored in the file block and then changes the state of the small document in the index to the deleted state.
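Reading and deleting through the in-block index can be sketched as below. The sketch assumes an index entry holds a start position, a length, and a state field; the names `IndexEntry`, `read_small_document`, and `delete_small_document` are hypothetical, and refusing to read a deleted entry is an assumption made here rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Dict

ACTIVE = "active"
DELETED = "deleted"


@dataclass
class IndexEntry:
    start: int           # start position of the small document's data in the block
    length: int          # length of the small document
    state: str = ACTIVE  # changed to DELETED by a delete instruction


def read_small_document(data: bytes, index: Dict[int, IndexEntry], seq: int) -> bytes:
    """Find the index entry, then slice the data region by start position and length."""
    entry = index[seq]
    if entry.state == DELETED:  # assumption: deleted entries are not readable
        raise KeyError(f"small document {seq} has been deleted")
    return data[entry.start:entry.start + entry.length]


def delete_small_document(index: Dict[int, IndexEntry], seq: int) -> None:
    """Logical delete: only the state in the index changes, the data bytes stay."""
    index[seq].state = DELETED


data = b"aaaabb"
index = {1: IndexEntry(0, 4), 2: IndexEntry(4, 2)}
print(read_small_document(data, index, 2))
delete_small_document(index, 1)
```

A delete in this model touches only the index entry, so it is cheap, and the space is reclaimed later by the consolidation step.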
Further, after the control node obtains the information of the memory node storing the small document according to the delete small document instruction and sends it to the client, the client sends the delete small document instruction to the corresponding memory node, and the memory node finds the index of the small document stored in the file block and changes the state of the small document in the index to the deleted state, the method further includes:
The control node computes the valid data ratio of every file block of the aggregate file and, for each file block whose valid data ratio is below a given ratio, reads the information of the memory node storing that file block;
The control node sends a garbage collection and file block consolidation command to the corresponding memory node, and the corresponding memory node merges all valid data according to the command and generates a new aggregate file.
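The consolidation step can be sketched for a single file block as follows. This is a simplified illustration: it assumes the valid data ratio is the number of bytes referenced by non-deleted index entries divided by the size of the data region, and it rebuilds one block rather than a whole aggregate file. The function names and the tuple layout of the index are hypothetical.

```python
from typing import Dict, Tuple

# index entry: (start position, length, deleted flag)
Index = Dict[int, Tuple[int, int, bool]]


def valid_ratio(data_len: int, index: Index) -> float:
    """Share of the data region still referenced by non-deleted index entries."""
    if data_len == 0:
        return 1.0
    valid = sum(length for _, length, deleted in index.values() if not deleted)
    return valid / data_len


def compact(data: bytes, index: Index) -> Tuple[bytes, Index]:
    """Copy only the valid small documents into a fresh data region with a fresh index."""
    new_data = bytearray()
    new_index: Index = {}
    for seq in sorted(index):
        start, length, deleted = index[seq]
        if deleted:
            continue
        new_index[seq] = (len(new_data), length, False)
        new_data += data[start:start + length]
    return bytes(new_data), new_index


data = b"aaaabbccc"                             # "aaaa" was superseded by "ccc", "bb" was deleted
index = {1: (6, 3, False), 2: (4, 2, True)}
THRESHOLD = 0.5                                 # the "given ratio"; this value is an assumption
if valid_ratio(len(data), index) < THRESHOLD:
    data, index = compact(data, index)
print(data, index)
```

Keeping the sequence numbers unchanged during compaction means that, in this sketch, identifiers already handed to clients would still resolve; whether the patent preserves them in the new aggregate file is not stated in this section.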
In another aspect, a control node is provided, the control node including:
A create file instruction receiving unit, configured to receive the create small document command sent by the client;
A small document sequence number generating unit, configured to create a new aggregate file or select an existing aggregate file from the aggregate file list according to the create small document command, and to generate the sequence number of the small document in the aggregate file according to the attribute information of the aggregate file;
An identifier and memory node information generating unit, configured to generate, according to the attribute information of the aggregate file and the sequence number of the small document in the aggregate file, the identifier of the small document and the information of the memory node used to store the small document, and to send them to the client, so that the client sends a write small document instruction to the corresponding memory node according to that information, and the memory node, according to the write small document instruction, writes the data of the small document into the corresponding file block and writes the start position of the data of the small document in the file block and the length of the small document into the index region of the file block.
Further, the attribute information of the aggregate file includes the identifier of the aggregate file and the block list of the aggregate file. The block list includes the sequence number of each file block in the aggregate file, the number of small documents stored in each file block, and the information of the memory node storing each file block.
Further, the identifier and memory node information generating unit includes:
A small document identifier generating module, configured to generate the identifier of the small document according to the identifier of the aggregate file and the sequence number of the small document in the aggregate file;
A file block sequence number generating module, configured to calculate the sequence number of the file block used to store the small document according to the sequence number of the small document in the aggregate file and the number of small documents stored in each file block;
A memory node information obtaining module, configured to obtain the information of the memory node used to store the small document according to the sequence number of the file block used to store the small document and the information of the memory node storing each file block.
Further, the control node also includes:
A modify small document instruction receiving unit, configured to receive the modify small document instruction sent by the client;
A small document modifying unit, configured to obtain, according to the modify small document instruction, the information of the memory node storing the small document and send it to the client, so that the client sends a modify small document instruction to the corresponding memory node according to that information, and the memory node, according to the instruction, inserts the modification data of the small document at the end of the file block that stores the small document and, in the index region of the file block, updates the start position of the data of the small document in the file block to the start storage position of the modification data and updates the length of the small document to the length of the modification data;
A read small document instruction receiving unit, configured to receive the read small document instruction sent by the client;
A small document reading unit, configured to obtain, according to the read small document instruction, the information of the memory node storing the small document and send it to the client, so that the client sends a read small document instruction to the corresponding memory node according to that information, and the memory node, according to the instruction, first finds the index of the small document stored in the file block, then reads the start position and length of the small document stored in the index, and finally reads the corresponding data according to the start position and length;
A delete small document instruction receiving unit, configured to receive the delete small document instruction sent by the client;
A small document deleting unit, configured to obtain, according to the delete small document instruction, the information of the memory node storing the small document and send it to the client, so that the client sends a delete small document instruction to the corresponding memory node according to that information, and the memory node, according to the instruction, first finds the index of the small document stored in the file block and then changes the state of the small document in the index to the deleted state;
A memory node information reading unit, configured to compute the valid data ratio of every file block of the aggregate file and, for each file block whose valid data ratio is below a given ratio, read the information of the memory node storing that file block;
A new aggregate file generating unit, configured to send a garbage collection and file block consolidation command to the corresponding memory node, so that the corresponding memory node merges all valid data according to the command and generates a new aggregate file.
In yet another aspect, a memory node is provided, the memory node including:
A write small document instruction receiving unit, configured to receive the write small document instruction sent by the client;
A data writing unit, configured to write the data of the small document into the corresponding file block according to the write small document instruction, and to write the start position of the data of the small document in the file block and the length of the small document into the index region of the file block.
Further, the data writing unit includes:
A small document identifier obtaining module, configured to extract the identifier of the small document from the write small document instruction;
A small document sequence number obtaining module, configured to obtain, from the identifier of the small document, the identifier of the aggregate file that stores the small document and the sequence number of the small document in the aggregate file;
A file block sequence number obtaining module, configured to calculate the sequence number of the file block used to store the small document according to the sequence number of the small document in the aggregate file and the number of small documents stored in each file block;
A data writing module, configured to write the data of the small document into the file block, and to write the start position of the data of the small document in the file block and the length of the small document into the index region of the file block.
Further, the memory node also includes:
A modify small document instruction receiving unit, configured to receive the modify small document instruction sent by the client;
A file modifying unit, configured to insert, according to the modify small document instruction, the modification data of the small document at the end of the file block that stores the small document and, in the index region of the file block, to update the start position of the data of the small document in the file block to the start storage position of the modification data and update the length of the small document to the length of the modification data;
A read small document instruction receiving unit, configured to receive the read small document instruction sent by the client;
A small document reading unit, configured to first find, according to the read small document instruction, the index of the small document stored in the file block, then read the start position and length of the small document stored in the index, and finally read the corresponding data according to the start position and length;
A delete small document instruction receiving unit, configured to receive the delete small document instruction sent by the client;
A small document deleting unit, configured to first find, according to the delete small document instruction, the index of the small document stored in the file block and then change the state of the small document in the index to the deleted state;
A garbage collection and file block consolidation command receiving unit, configured to receive the garbage collection and file block consolidation command sent by the control node;
A new aggregate file generating unit, configured to merge all valid data according to the command and generate a new aggregate file.
In yet another aspect, an access system for small documents is provided. The system includes a client, the control node described above, and the memory node described above. The control node is connected to at least one memory node, and the client is connected to both the control node and the memory node.
In the embodiments of the present invention, small documents are stored in the file blocks of an aggregate file, and the index of the small documents is stored in those same file blocks, so no external index is needed. Because the index and the data of a small document are merged in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, apply to both, and the scheme is simple and convenient to use. If an external index were used instead, the index file and the data file of the small documents would be stored separately, and an additional external program would be needed to guarantee properties such as consistency between the index file and the data file. In addition, a small document can be uniquely identified by the identifier of the aggregate file together with the sequence number of the small document in the aggregate file. When a small document is later modified, deleted, locked, unlocked, or accessed, the file block containing it can be obtained directly from the small document's identifier, and the operation can then be carried out on that file block.
Brief description of the drawings
Fig. 1 is a flowchart of the access method for small documents provided by embodiment one of the present invention;
Fig. 2 is a schematic diagram of the overall file structure provided by embodiment one of the present invention;
Fig. 3 is a schematic diagram of the structure of a file block provided by embodiment one of the present invention;
Fig. 4 is a flowchart of the modification method for small documents provided by embodiment two of the present invention;
Fig. 5 is a flowchart of the access method for small documents provided by embodiment three of the present invention;
Fig. 6 is a flowchart of the deletion method for small documents provided by embodiment four of the present invention;
Fig. 7 is a schematic diagram of the change in file block structure after a small document is modified, provided by embodiment two of the present invention;
Fig. 8 is a flowchart of the consolidation method for aggregate files provided by embodiment five of the present invention;
Fig. 9 is a structural block diagram of the control node provided by embodiment six of the present invention;
Fig. 10 is a structural block diagram of the memory node provided by embodiment seven of the present invention;
Fig. 11 is a structural block diagram of the access system for small documents provided by embodiment eight of the present invention.
Detailed description
To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the present invention and are not intended to limit it.
In the embodiments of the present invention, small documents are stored in the file blocks of an aggregate file, and the index of the small documents is stored in those same file blocks, so no external index is needed. Because the index and the data of a small document are merged in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, apply to both, and the scheme is simple and convenient to use. If an external index were used instead, the index file and the data file of the small documents would be stored separately, and an additional external program would be needed to guarantee properties such as consistency between the index file and the data file.
The implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment one
Fig. 1 shows the flow of the access method for small documents provided by embodiment one of the present invention, described from the control node side. The details are as follows:
In step S101, the control node receives the create small document command sent by the client.
In this embodiment, the create small document command that the control node receives from the client includes the number of replicas of the small document. According to that number, the control node can store the small document in a distributed manner across at least two memory nodes, as many as the number of replicas.
It should be noted that the file storage method provided by this embodiment is applied to a distributed file system (Sky Distributed File System, SDFS). The distributed file system consists of two parts, control nodes and memory nodes, and one control node can be connected to at least one memory node. The control node stores the attribute information of all files, each file block in a memory node holds the data of an aggregate file, and small documents are merged into the aggregate file through a specific structure. The overall file structure is shown in Fig. 2. The control node holds the attribute information of the aggregate file, such as the file identifier id of the aggregate file, the file block list BlockSize of the aggregate file, and the file type Type of the aggregate file. Specifically, BlockSize includes file blocks block1, block2, and so on up to blockn. Memory node 1 stores block1 and block2, where the small document stored in block1 is Xxx.rar and the small document stored in block2 is Xxx.doc. File block blockn is stored in memory node 2, and the small document stored in blockn is Xxx.jpg.
In step S102, according to the received create small document command, the control node creates a new aggregate file or selects an existing aggregate file from the aggregate file list, and generates the sequence number of the small document in the aggregate file according to the attribute information of the aggregate file.
In this embodiment, the control node stores an aggregate file list, which holds the file identifier id of each aggregate file stored in the memory nodes.
If the list is empty, the control node creates a new aggregate file, generates the attribute information of the new aggregate file, and sets the sequence number of the small document in the new aggregate file to 1. The sequence number could also start at 0; no limitation is imposed here.
If the list is not empty, the control node selects an existing aggregate file from the list and sets the sequence number of the small document in that aggregate file to the sequence number of the last small document already stored in the aggregate file plus 1.
Specifically, the attribute information of the aggregate file stores the number of file blocks in the aggregate file and the number of small documents stored in each file block. From this information the number of small documents in the aggregate file can be obtained, and from that the sequence number of the small document to be created. Alternatively, the attribute information can include the largest small document sequence number currently in the aggregate file, in which case the sequence number of the small document to be created is the current largest sequence number plus 1. The specific manner is not limited here.
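The two ways of deriving the next sequence number described above can be sketched in a few lines. This is a minimal sketch assuming sequence numbers start at 1; the function name and parameters are hypothetical.

```python
from typing import List, Optional


def next_sequence_number(small_doc_counts: List[int],
                         current_max: Optional[int] = None) -> int:
    """Sequence number for the small document about to be created.

    small_doc_counts: number of small documents stored in each file block.
    current_max: largest sequence number recorded in the attribute
    information, if the aggregate file tracks it explicitly.
    """
    if current_max is not None:
        return current_max + 1
    # Otherwise derive it from the per-block counts in the attribute information.
    return sum(small_doc_counts) + 1


print(next_sequence_number([]))        # a newly created aggregate file
print(next_sequence_number([3, 2]))    # an aggregate file that already holds five small documents
```

If modified or deleted small documents keep their sequence numbers, the two derivations agree only while the per-block counts include every sequence number ever issued; tracking the current maximum avoids that dependency.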
Specifically, the control node stores the attribute information of the aggregate files. The attribute information includes, for each aggregate file, the file identifier id of the aggregate file, the file block list BlockSize of the aggregate file, and the file type Type of the aggregate file.
The file identifier id of an aggregate file uniquely identifies that aggregate file; the corresponding aggregate file can be obtained through this identifier;
BlockSize includes the identifier of each file block in the aggregate file, the number of small files stored in each file block, and the information of the storage node that stores each file block;
Type identifies the type of the aggregate file. In this embodiment the file type is marked with 3 bits, and "1" indicates that the file is an aggregate file.
A small file has no file attributes of its own, but it has a unique 64-bit file identifier id. This id consists of three parts: the aggregate file id flag, the id of the aggregate file, and the small file sequence number (the sequence number of the small file within the aggregate file), as follows:
000|0000…0000|00…00
3bit 44bit 17bit
Here, the 3 bits are the aggregate file id flag: "1" indicates that the 64-bit file identifier id is an aggregate file id, and "0" indicates that the 64-bit file identifier id is the id of a small file;
The 44 bits (6byte) are the id of the aggregate file in which the small file is located;
The 17 bits are the sequence number of the small file within the aggregate file.
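The 3 + 44 + 17 bit layout can be illustrated with a small packing and unpacking sketch. The function names are hypothetical, and the sketch assumes that the fields are laid out from the most significant bit in the order shown above (flag, aggregate file id, sequence number); the text does not state the bit order explicitly.

```python
from typing import Tuple

FLAG_BITS, AGG_BITS, SEQ_BITS = 3, 44, 17  # 3 + 44 + 17 = 64 bits

FLAG_SMALL_FILE = 0  # "0": the id is a small file id
FLAG_AGGREGATE = 1   # "1": the id is an aggregate file id


def pack_small_file_id(aggregate_id: int, seq: int) -> int:
    """Build the 64-bit id of a small file from its aggregate file id and sequence number."""
    if not 0 <= aggregate_id < (1 << AGG_BITS):
        raise ValueError("aggregate id does not fit in 44 bits")
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("sequence number does not fit in 17 bits")
    return (FLAG_SMALL_FILE << (AGG_BITS + SEQ_BITS)) | (aggregate_id << SEQ_BITS) | seq


def unpack_file_id(file_id: int) -> Tuple[int, int, int]:
    """Split a 64-bit id into (flag, aggregate file id, small file sequence number)."""
    seq = file_id & ((1 << SEQ_BITS) - 1)
    aggregate_id = (file_id >> SEQ_BITS) & ((1 << AGG_BITS) - 1)
    flag = (file_id >> (AGG_BITS + SEQ_BITS)) & ((1 << FLAG_BITS) - 1)
    return flag, aggregate_id, seq
```

With 17 bits for the sequence number, one aggregate file can address at most 2^17 = 131072 small files under this layout.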
In a storage node, the structure of a file block depends on the kind of file. A file block of an ordinary file is simply a segment of that file's data. Because each file block of an aggregate file contains multiple small files, a certain structure is needed to support reading an individual small file from the file block. The structure of the file block is shown in Fig. 3 and comprises two parts: index segment data and file data.
The index segment data comprises the Head and the index region. The index region stores the index corresponding to each small file in the file block; each small file corresponds to exactly one index.
Head (4 bytes): stores the number of indexes held in the index region;
The size of the index region is: the number of indexes recorded in Head * the size of each index (10 bytes).
When a file block of an aggregate file is created, an index is first established for each small file in the file block. Each index contains the starting offset of the small file within the file block (5 bytes, so a file block is at most 1 TB) and the small file length (4 bytes, so a single small file is at most 4 GB). An index may also contain the small file state (1 byte, indicating states such as whether the small file is deleted or locked), although this is not marked in Fig. 3.
The index position corresponding to a small file can be calculated directly from the sequence number of the small file in the aggregate file (the small file sequence number for short). The formula is as follows:
index position = small file sequence number / number of small files stored in each file block * size of each index (10 bytes) + 4 (the Head, which holds the number of small files currently stored).
The file data is stored after the index segment data. Because the sizes of the small files stored in a file block are not fixed, while the number of small files stored in a file block is fixed, the size of each file block of an aggregate file is not fixed.
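The index layout described above (a 4-byte Head followed by 10-byte indexes) can be sketched as below. Several points are assumptions of this sketch rather than statements of the text: big-endian byte order, 0-based sequence numbers, and the reading that the slot of a small file inside its file block is the remainder of the sequence number divided by the per-block count. The printed formula uses a division, but the quotient is what the text elsewhere uses as the file block sequence number, so the remainder is assumed here for the position inside the block.

```python
from typing import Tuple

HEAD_SIZE = 4    # bytes: number of indexes stored in the index region
INDEX_SIZE = 10  # bytes: 5 (offset) + 4 (length) + 1 (state)


def index_position(seq: int, files_per_block: int) -> int:
    """Byte position of a small file's index inside its file block (assumed remainder reading)."""
    slot = seq % files_per_block
    return HEAD_SIZE + slot * INDEX_SIZE


def encode_index(offset: int, length: int, state: int = 0) -> bytes:
    """Serialize one index: 5-byte offset, 4-byte length, 1-byte state (big-endian assumed)."""
    return offset.to_bytes(5, "big") + length.to_bytes(4, "big") + state.to_bytes(1, "big")


def decode_index(raw: bytes) -> Tuple[int, int, int]:
    """Parse a 10-byte index back into (offset, length, state)."""
    if len(raw) != INDEX_SIZE:
        raise ValueError("an index is exactly 10 bytes")
    return int.from_bytes(raw[0:5], "big"), int.from_bytes(raw[5:9], "big"), raw[9]
```

The field widths explain the limits quoted in the text: a 5-byte offset addresses 2^40 bytes (1 TB) and a 4-byte length addresses 2^32 bytes (4 GB).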
In step S103, the control node generates the identifier of the small file and the information of the storage node used to store the small file, according to the attribute information of the aggregate file and the sequence number of the small file in the aggregate file, and sends them to the client. The client sends a write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file. According to the write-small-file instruction, the storage node writes the data of the small file into the corresponding file block, and writes the starting position of the small file's data within the file block and the length of the small file into the index region of that file block.
In this embodiment of the present invention, after the control node has obtained, through step S102, the identifier of the aggregate file in which the small file resides and the sequence number of the small file within that aggregate file, it can derive the identifier of the small file according to the structure of the small file identifier.
The information of the storage node that stores the small file can be obtained through the following steps:
Step 1: the control node calculates the sequence number of the file block used to store the small file, according to the sequence number of the small file in the aggregate file and the number of small files stored in each file block.
Step 2: the control node obtains the information of the storage node used to store the small file, according to the sequence number of the file block used to store the small file and the information of the storage node that stores each file block.
After obtaining the identifier of the small file and the information of the storage node used to store it, the control node sends this information to the client.
The client sends a write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file. After receiving the write-small-file instruction sent by the client, the storage node first extracts the identifier of the small file from the instruction, and from that identifier obtains the identifier of the aggregate file that stores the small file and the sequence number of the small file in the aggregate file. It then calculates the sequence number of the file block used to store the small file, according to the sequence number of the small file in the aggregate file and the number of small files stored in each file block of the aggregate file. Finally, it writes the data of the small file into the corresponding file block, and writes the starting position offset of the small file's data within the file block and the length of the small file into the index region of that file block.
Here, file block sequence number = small file sequence number / number of small files stored in each file block.
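Steps 1 and 2 amount to an integer division followed by a table lookup, which the following sketch illustrates. The function names and the representation of the file block list as a plain Python list of node names are hypothetical; sequence numbers and file block numbers are assumed to be 0-based.

```python
from typing import List


def block_number(seq: int, files_per_block: int) -> int:
    """Step 1: file block sequence number = small file sequence number / files per block."""
    return seq // files_per_block


def locate_storage_node(seq: int, files_per_block: int, block_nodes: List[str]) -> str:
    """Step 2: look up the storage node that holds the file block computed in step 1.

    block_nodes[i] is the (hypothetical) storage node information for file block i.
    """
    return block_nodes[block_number(seq, files_per_block)]
```

For example, with 4 small files per file block and file blocks 0 and 1 on one node and file block 2 on another, sequence number 9 falls in file block 2 and is routed to the second node.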
As a preferred embodiment of the present invention, before the storage node receives the write-small-file instruction sent by the client, the storage node may also receive an allocate-file-block-operation-permission instruction sent by the control node, which grants the storage node a file block operation permission. In this embodiment, the granted file block operation permission is the write-file permission.
In this embodiment, small files are stored in the file blocks of an aggregate file, and the indexes of the small files are stored in those same file blocks, so no external index is needed. Because the index and the data of a small file are combined in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, carry over fully, and the scheme is simple and convenient to use. If an external index were used instead, the index file and the data file of the small files would be stored separately, and an additional external program would be required to ensure properties such as consistency between the index file and the data file.
In addition, a small file can be uniquely identified by the identifier of the aggregate file together with the sequence number of the small file in the aggregate file. When the small file is subsequently modified, deleted, locked, unlocked, or accessed, the file block in which it resides can be obtained directly from the small file identifier, and the operation can then be carried out on that file block.
Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc.
Embodiment two
Fig. 4 shows the implementation flow of the small file modification method provided by Embodiment two of the present invention, described from the control node side as an example. The details are as follows:
The steps in this embodiment are performed after step S103 in Embodiment one.
In step S401, the control node receives the modify-small-file instruction sent by the client.
In this embodiment of the present invention, the control node receives the modify-small-file instruction sent by the client; the instruction includes the file identifier of the small file to be modified.
In step S402, the control node obtains the information of the storage node used to store the small file according to the received modify-small-file instruction, and sends it to the client. The client sends a modify-small-file instruction to the corresponding storage node according to that information. According to the modify-small-file instruction, the storage node appends the modification data of the small file at the end of the file block that stores the small file and, in the index region of that file block, updates the starting position of the small file's data to the starting storage position of the modification data and updates the length of the small file to the length of the modification data.
In this embodiment of the present invention, the control node obtains, from the file identifier of the small file, the identifier of the aggregate file in which the small file resides and the sequence number of the small file in that aggregate file. It then reads the attribute information of the aggregate file according to the identifier of the aggregate file, and from the attribute information and the sequence number of the small file obtains the sequence number of the file block in which the small file resides and the information of the storage node that stores that file block.
After obtaining the information of the storage node that stores the file block, the control node sends that information to the client.
The client sends a modify-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file. The instruction includes the file identifier of the small file to be modified and the sequence number of the file block in which the small file resides. The storage node first finds the corresponding file block, and the index of the small file stored in it, according to the sequence number of the file block. It then appends the modification data of the small file at the end of the file block and, in the index region of the file block, updates the starting position of the small file's data to the starting storage position of the modification data and updates the length of the small file to the length of the modification data. The change in the structure of the file block after the small file is modified is shown in Fig. 7. In Fig. 7, 1 denotes the client sending the modify-small-file instruction to the control node and receiving the information of the storage node used to store the small file returned by the control node; 2 denotes the control node granting the file block operation permission to the storage node; 3 denotes the client sending the modify-small-file instruction to the storage node; and 4 denotes the storage node modifying the structure of the file block according to the instruction.
In this embodiment of the present invention, the modification data is appended at the end of the file block instead of directly overwriting the original data, for the following reason: the modification data of a small file often differs in size from the original data, so it cannot simply be placed where the original data was. Appending the modification data at the end of the file block solves this problem. The original small file data is left in place for the time being, and this useless data is merged away later during small file garbage collection.
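The append-and-repoint behaviour described above can be illustrated with a small in-memory model. The `FileBlock` class is hypothetical and simplified: it keeps the index as a Python list instead of serialized 10-byte entries, and offsets are counted from the start of the data area rather than from the start of the file block.

```python
from typing import List, Optional, Tuple


class FileBlock:
    """Simplified in-memory model of one file block of an aggregate file."""

    def __init__(self, files_per_block: int) -> None:
        # One (offset, length) index per small file slot.
        self.index: List[Optional[Tuple[int, int]]] = [None] * files_per_block
        self.data = bytearray()

    def write(self, slot: int, payload: bytes) -> None:
        """Write a new small file: append its data and record offset and length."""
        self.index[slot] = (len(self.data), len(payload))
        self.data += payload

    def modify(self, slot: int, payload: bytes) -> None:
        """Modify a small file: append the new data at the end and repoint the index.

        The old bytes stay where they are until garbage collection removes them.
        """
        if self.index[slot] is None:
            raise KeyError("no small file stored in this slot")
        self.index[slot] = (len(self.data), len(payload))
        self.data += payload

    def read(self, slot: int) -> bytes:
        entry = self.index[slot]
        if entry is None:
            raise KeyError("no small file stored in this slot")
        offset, length = entry
        return bytes(self.data[offset:offset + length])
```

After a modification the neighbouring small files are untouched, because only the index of the modified file changes; the cost is that the superseded bytes remain in the file block as garbage.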
As a preferred embodiment of the present invention, before the storage node receives the modify-small-file instruction sent by the client, the storage node may also receive an allocate-file-block-operation-permission instruction sent by the control node, which grants the storage node a file block operation permission. In this embodiment, the granted file block operation permission is the modify-file permission.
In this embodiment, the sequence number of the file block in which a small file resides can be obtained from the file identifier of the small file. According to that sequence number, the corresponding file block and the index of the small file stored in it are found, the modification data is appended at the end of the file block, and the index information of the small file is updated with the storage position and the length of the modification data, thereby realizing the modification of the small file data.
Embodiment three
Fig. 5 shows the implementation flow of the small file access method provided by Embodiment three of the present invention, described from the control node side as an example. The details are as follows:
The steps in this embodiment are performed after step S103 in Embodiment one.
In step S501, the control node receives the read-small-file instruction sent by the client.
In this embodiment of the present invention, the control node receives the read-small-file instruction sent by the client; the instruction includes the file identifier of the small file to be read.
In step S502, the control node obtains the information of the storage node used to store the small file according to the received read-small-file instruction, and sends it to the client. The client sends a read-small-file instruction to the corresponding storage node according to that information. According to the read-small-file instruction, the storage node first finds the index of the small file stored in the file block, then reads the starting position and length information of the small file stored in the index, and finally reads the corresponding data according to the starting position and length information.
In this embodiment of the present invention, the control node obtains, from the file identifier of the small file, the identifier of the aggregate file in which the small file resides and the sequence number of the small file in that aggregate file. It then reads the attribute information of the aggregate file according to the identifier of the aggregate file, and from the attribute information and the sequence number of the small file obtains the sequence number of the file block in which the small file resides and the information of the storage node that stores that file block.
After obtaining the information of the storage node that stores the file block, the control node sends that information to the client.
The client sends a read-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file. The instruction includes the file identifier of the small file to be read and the sequence number of the file block in which the small file resides. The storage node first finds the corresponding file block, and the index of the small file stored in it, according to the sequence number of the file block. It then reads the starting position and length information of the small file stored in the index, and finally reads the corresponding data from the file data region according to the starting position and length information.
In this embodiment, the sequence number of the file block in which a small file resides can be obtained from the file identifier of the small file. According to that sequence number, the corresponding file block and the index of the small file stored in it are found, the starting position and length information of the small file stored in the index are read, and the corresponding data is read according to the starting position and length information, thereby realizing the reading of the small file data.
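The read path on the storage node (locate the index, read offset and length, then read the data) can be sketched over a serialized file block. The sketch assumes big-endian fields, 0-based sequence numbers, offsets measured from the start of the file block, and that the slot inside the block is the remainder of the sequence number divided by the per-block count; `read_small_file` is a hypothetical name.

```python
HEAD_SIZE = 4    # bytes in the Head
INDEX_SIZE = 10  # bytes per index: 5 offset + 4 length + 1 state


def read_small_file(block: bytes, seq: int, files_per_block: int) -> bytes:
    """Read one small file out of a serialized file block."""
    # 1. Find the index of the small file inside the index region.
    pos = HEAD_SIZE + (seq % files_per_block) * INDEX_SIZE
    entry = block[pos:pos + INDEX_SIZE]
    # 2. Read the starting position and length stored in the index.
    offset = int.from_bytes(entry[0:5], "big")
    length = int.from_bytes(entry[5:9], "big")
    # 3. Read the corresponding data from the file data region.
    return block[offset:offset + length]
```

Because each index has a fixed size and a computable position, a read needs no scan of the file block: one index lookup and one data read are enough.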
Embodiment four
Fig. 6 shows the implementation flow of the small file deletion method provided by Embodiment four of the present invention, described from the control node side as an example. The details are as follows:
The steps in this embodiment are performed after step S103 in Embodiment one.
The deletion of a small file is recorded using the small file state field in the index of the file block.
In step S601, the control node receives the delete-small-file instruction sent by the client.
In this embodiment of the present invention, the control node receives the delete-small-file instruction sent by the client; the instruction includes the file identifier of the small file to be deleted.
In step S602, the control node obtains the information of the storage node used to store the small file according to the received delete-small-file instruction, and sends it to the client. The client sends a delete-small-file instruction to the corresponding storage node according to that information. According to the delete-small-file instruction, the storage node first finds the index of the small file stored in the file block, and then changes the small file state in the index to the deleted state.
In this embodiment of the present invention, the control node obtains, from the file identifier of the small file, the identifier of the aggregate file in which the small file resides and the sequence number of the small file in that aggregate file. It then reads the attribute information of the aggregate file according to the identifier of the aggregate file, and from the attribute information and the sequence number of the small file obtains the sequence number of the file block in which the small file resides and the information of the storage node that stores that file block.
After obtaining the information of the storage node that stores the file block, the control node sends that information to the client.
The client sends a delete-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file. The instruction includes the file identifier of the small file to be deleted and the sequence number of the file block in which the small file resides. The storage node first finds the corresponding file block, and the index of the small file stored in it, according to the sequence number of the file block, and then changes the small file state in the index to the deleted state.
As a preferred embodiment of the present invention, before the storage node receives the delete-small-file instruction sent by the client, the storage node may also receive an allocate-file-block-operation-permission instruction sent by the control node, which grants the storage node a file block operation permission. In this embodiment, the granted file block operation permission is the delete-file permission.
In this embodiment, the sequence number of the file block in which a small file resides can be obtained from the file identifier of the small file. According to that sequence number, the corresponding file block and the index of the small file stored in it are found, and the small file state in the index is changed to the deleted state, thereby realizing the deletion of the small file.
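Deletion as described here only flips the state byte of the index, which the following sketch illustrates. The state values (`0` for normal, `1` for deleted), the position of the state byte as the last byte of the 10-byte index, 0-based sequence numbers, and the function names are assumptions; the text only says that the index may carry a 1-byte state.

```python
HEAD_SIZE = 4
INDEX_SIZE = 10
STATE_OFFSET = 9   # assumed: state is the last byte of each index

STATE_NORMAL = 0   # assumed encoding
STATE_DELETED = 1  # assumed encoding


def _state_position(seq: int, files_per_block: int) -> int:
    """Byte position of the state field of a small file's index."""
    return HEAD_SIZE + (seq % files_per_block) * INDEX_SIZE + STATE_OFFSET


def mark_deleted(block: bytearray, seq: int, files_per_block: int) -> None:
    """Logical delete: set the state byte; the file data itself is left untouched."""
    block[_state_position(seq, files_per_block)] = STATE_DELETED


def is_deleted(block: bytes, seq: int, files_per_block: int) -> bool:
    return block[_state_position(seq, files_per_block)] == STATE_DELETED
```

Since only one byte changes, deletion is cheap; reclaiming the space is deferred to a later clean-up pass.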
Embodiment five
Fig. 8 shows the implementation flow of the aggregate file clean-up method provided by Embodiment five of the present invention, described from the control node side as an example. The details are as follows:
The steps in this embodiment are performed after step S602 in Embodiment four.
Because a large number of small files are aggregated and stored in one large aggregate file, the deletion described in Embodiment four is realized merely by changing the small file state to the deleted state in the index of the corresponding small file within the file block of the aggregate file; the data of the small file is not actually deleted. It is therefore necessary to perform garbage collection and clean-up on the aggregate file at an appropriate time, so as to compact and merge the file blocks.
In step S801, the control node counts the valid data ratio in all file blocks of the aggregate file and, for each file block whose valid data ratio is lower than a given ratio, reads the information of the storage node that stores that file block.
In this embodiment of the present invention, the valid data ratio is equal to the number of small files that have not been deleted divided by the total number of small files.
For file blocks whose valid data ratio is lower than the given ratio, garbage collection and clean-up are performed, and file blocks that contain no valid data are deleted.
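The selection in step S801 can be sketched as follows. The sketch assumes that the valid data ratio of a file block is the share of its small files that are not marked deleted, which is the reading under which a low ratio indicates a block worth cleaning; the state encoding, the function names, and the dictionary of per-block states are hypothetical.

```python
from typing import Dict, List

STATE_DELETED = 1  # assumed encoding of the deleted state


def valid_ratio(states: List[int]) -> float:
    """Share of small files in one file block that are not marked deleted."""
    if not states:
        return 0.0
    live = sum(1 for s in states if s != STATE_DELETED)
    return live / len(states)


def blocks_to_clean(block_states: Dict[str, List[int]], given_ratio: float) -> List[str]:
    """Names of the file blocks whose valid data ratio is lower than the given ratio."""
    return [name for name, states in block_states.items()
            if valid_ratio(states) < given_ratio]
```

A file block whose ratio is 0 holds no valid data at all and can simply be deleted instead of being merged.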
It should be noted that, before step S801, the control node may also periodically scan for aggregate files created before a certain time and start the garbage collection and clean-up function for those aggregate files.
In step S802, the control node sends a garbage-collection-and-clean-up file block command to the corresponding storage node, and the corresponding storage node merges all valid data according to the command and generates a new aggregate file.
In this embodiment of the present invention, the control node initiates the garbage-collection-and-clean-up file block command to the storage node that holds each file block whose valid data ratio is lower than the given ratio. The storage node merges all valid data, generates a new aggregate file, and returns the garbage collection and clean-up result to the control node.
The process of merging and generating the new aggregate file is the same as that described in step S102, except that in step S102 the creation of the aggregate file is initiated by the client, whereas here the process is started by the control node itself.
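The merge performed by the storage node can be illustrated with a minimal compaction sketch. It is an assumption of this sketch that live small files are copied in their original order and receive new consecutive positions in the new file block; the text does not say how identifiers of surviving small files are handled after the merge. The function name and the tuple representation of an index are hypothetical.

```python
from typing import List, Tuple

Index = Tuple[int, int, bool]  # (offset into data, length, deleted flag)


def compact(indexes: List[Index], data: bytes) -> Tuple[List[Index], bytes]:
    """Copy only the data of small files that are not deleted into a new data area."""
    new_indexes: List[Index] = []
    new_data = bytearray()
    for offset, length, deleted in indexes:
        if deleted:
            continue  # garbage: deleted file, nothing is copied
        new_indexes.append((len(new_data), length, False))
        new_data += data[offset:offset + length]
    return new_indexes, bytes(new_data)
```

Bytes that no live index points to, such as the superseded data left behind by an append-style modification, are dropped as a side effect, because only ranges referenced by live indexes are copied.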
This embodiment, on the basis of Embodiment four, implements the actual deletion of the data of small files in the aggregate file.
Embodiment six
Fig. 9 shows a structural block diagram of the control node provided by Embodiment six of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown. The control node 9 comprises: a create-file instruction receiving unit 91, a small file sequence number generating unit 92, and an identifier and storage node information generating unit 93.
The create-file instruction receiving unit 91 is configured to receive the create-small-file command sent by the client;
The small file sequence number generating unit 92 is configured to create a new aggregate file or select an existing aggregate file from the aggregate file list according to the create-small-file command, and at the same time generate the sequence number of the small file in the aggregate file according to the attribute information of the aggregate file;
The identifier and storage node information generating unit 93 is configured to generate the identifier of the small file and the information of the storage node used to store the small file, according to the attribute information of the aggregate file and the sequence number of the small file in the aggregate file, and send them to the client. The client sends a write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file; according to the write-small-file instruction, the storage node writes the data of the small file into the corresponding file block, and writes the starting position of the small file's data within the file block and the length of the small file into the index region of the file block.
Specifically, the attribute information of the aggregate file includes the identifier of the aggregate file and the storage block list of the aggregate file. The storage block list includes the sequence number of each file block in the aggregate file, the number of small files stored in each file block, and the information of the storage node that stores each file block.
Specifically, the identifier and storage node information generating unit 93 comprises:
A small file identifier generating module, configured to generate the identifier of the small file according to the identifier of the aggregate file and the sequence number of the small file in the aggregate file;
A file block sequence number generating module, configured to calculate the sequence number of the file block used to store the small file, according to the sequence number of the small file in the aggregate file and the number of small files stored in each file block;
A storage node information obtaining module, configured to obtain the information of the storage node used to store the small file, according to the sequence number of the file block used to store the small file and the information of the storage node that stores each file block.
In addition, preferably, the control node 9 further comprises:
A modify-small-file instruction receiving unit, configured to receive the modify-small-file instruction sent by the client;
A small file modifying unit, configured to obtain the information of the storage node used to store the small file according to the modify-small-file instruction and send it to the client. The client sends a modify-small-file instruction to the corresponding storage node according to that information; according to the modify-small-file instruction, the storage node appends the modification data of the small file at the end of the file block that stores the small file and, in the index region of the file block, updates the starting position of the small file's data to the starting storage position of the modification data and updates the length of the small file to the length of the modification data;
A read-small-file instruction receiving unit, configured to receive the read-small-file instruction sent by the client;
A small file reading unit, configured to obtain the information of the storage node used to store the small file according to the read-small-file instruction and send it to the client. The client sends a read-small-file instruction to the corresponding storage node according to that information; according to the read-small-file instruction, the storage node first finds the index of the small file stored in the file block, then reads the starting position and length information of the small file stored in the index, and finally reads the corresponding data according to the starting position and length information;
A delete-small-file instruction receiving unit, configured to receive the delete-small-file instruction sent by the client;
A small file deleting unit, configured to obtain the information of the storage node used to store the small file according to the delete-small-file instruction and send it to the client. The client sends a delete-small-file instruction to the corresponding storage node according to that information; according to the delete-small-file instruction, the storage node first finds the index of the small file stored in the file block and then changes the small file state in the index to the deleted state;
A storage node information reading unit, configured to count the valid data ratio in all file blocks of the aggregate file and, for each file block whose valid data ratio is lower than a given ratio, read the information of the storage node that stores that file block;
A new aggregate file generating unit, configured to send a garbage-collection-and-clean-up file block command to the corresponding storage node, so that the corresponding storage node merges all valid data according to the command and generates a new aggregate file.
The control node provided by this embodiment of the present invention can be applied in the corresponding method embodiments described above. For details, refer to the description of those method embodiments, which is not repeated here.
In this embodiment, small files are stored in the file blocks of an aggregate file, and the indexes of the small files are stored in those same file blocks, so no external index is needed. Because the index and the data of a small file are combined in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, carry over fully, and the scheme is simple and convenient to use. If an external index were used instead, the index file and the data file of the small files would be stored separately, and an additional external program would be required to ensure properties such as consistency between the index file and the data file.
In addition, a small file can be uniquely identified by the identifier of the aggregate file together with the sequence number of the small file in the aggregate file. When the small file is subsequently modified, deleted, locked, unlocked, or accessed, the file block in which it resides can be obtained directly from the small file identifier, and the operation can then be carried out on that file block.
Embodiment seven
Figure 10 shows a structural block diagram of the storage node provided by Embodiment seven of the present invention; for ease of description, only the parts relevant to this embodiment are shown. The storage node 10 includes a write-small-file instruction receiving unit 101 and a data writing unit 102.
The write-small-file instruction receiving unit 101 is configured to receive the write-small-file instruction sent by the client;
The data writing unit 102 is configured to write the data of the small file into the corresponding file block according to the write-small-file instruction, and to write the start position of the data of the small file within the file block and the length of the small file into the index region of the file block.
Specifically, the data writing unit 102 includes:
A small-file identifier obtaining module, configured to extract the identifier of the small file from the write-small-file instruction;
A small-file sequence number obtaining module, configured to obtain, from the identifier of the small file, the identifier of the aggregate file that stores the small file and the sequence number of the small file within the aggregate file;
A file block sequence number obtaining module, configured to calculate the sequence number of the file block used to store the small file from the sequence number of the small file within the aggregate file and the number of small files stored in each file block;
A data writing module, configured to write the data of the small file into the file block, and to write the start position of the data of the small file within the file block and the length of the small file into the index region of the file block.
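The block lookup and write performed by these modules can be sketched as below. The sketch assumes that sequence numbers start at 0 and that file blocks are filled in order, so the block is found from the running totals of the per-block counts; the embodiment does not fix these details. `locate_block`, `FileBlock`, and `IndexEntry` are hypothetical names, and the in-memory layout stands in for the real on-disk file block.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from itertools import accumulate


def locate_block(seq: int, files_per_block: list[int]) -> int:
    """Return the 0-based index of the file block that holds small file `seq`."""
    ends = list(accumulate(files_per_block))  # running totals, e.g. [3, 2, 4] -> [3, 5, 9]
    if seq < 0 or not ends or seq >= ends[-1]:
        raise IndexError(f"sequence number {seq} is outside the aggregate file")
    return bisect_right(ends, seq)


@dataclass
class IndexEntry:
    start: int   # start position of the data inside the file block
    length: int  # length of the small file


@dataclass
class FileBlock:
    index: dict[int, IndexEntry] = field(default_factory=dict)  # index region: seq -> entry
    data: bytearray = field(default_factory=bytearray)          # file data region

    def write(self, seq: int, payload: bytes) -> IndexEntry:
        """Append the payload and record its start position and length in the index."""
        entry = IndexEntry(start=len(self.data), length=len(payload))
        self.data += payload
        self.index[seq] = entry
        return entry
```

Keeping the index entry inside the same block as the data is what lets a single block write update both together, which is the property the embodiment relies on in place of an external index.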
Further, the storage node 10 also includes:
A modify-small-file instruction receiving unit, configured to receive the modify-small-file instruction sent by the client;
A file modification unit, configured to insert, according to the modify-small-file instruction, the modified data of the small file at the end of the file block that stores the small file, and at the same time, in the index region of the file block, to update the start position of the data of the small file within the file block to the position at which the modified data is stored and to update the length of the small file to the length of the modified data;
A read-small-file instruction receiving unit, configured to receive the read-small-file instruction sent by the client;
A small-file reading unit, configured to first find, according to the read-small-file instruction, the index of the small file stored in the file block, then read the start position and length of the small file recorded in the index, and finally read the corresponding data according to the start position and length;
A delete-small-file instruction receiving unit, configured to receive the delete-small-file instruction sent by the client;
A small-file deletion unit, configured to first find, according to the delete-small-file instruction, the index of the small file stored in the file block, and to change the state of the small file in the index to deleted;
A garbage collection and file block cleanup command receiving unit, configured to receive the garbage collection and file block cleanup command sent by the control node;
A new aggregate file generating unit, configured to merge all valid data according to the command and generate a new aggregate file.
The storage node provided by this embodiment of the present invention can be applied in the corresponding method embodiments described above; for details, see the description of those method embodiments, which is not repeated here.
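The modify, read, delete, and cleanup behaviour of the units above can be sketched on a single in-memory file block. This is an illustration under assumptions, not the patented implementation: the state values, the `valid_ratio` helper, and compacting one block at a time (rather than regenerating a whole aggregate file) are simplifications chosen for the example, and all names are hypothetical.

```python
from dataclasses import dataclass, field

VALID, DELETED = "valid", "deleted"


@dataclass
class IndexEntry:
    start: int
    length: int
    state: str = VALID


@dataclass
class FileBlock:
    index: dict = field(default_factory=dict)           # seq -> IndexEntry
    data: bytearray = field(default_factory=bytearray)  # file data region

    def _live(self, seq):
        entry = self.index.get(seq)
        if entry is None or entry.state == DELETED:
            raise KeyError(seq)
        return entry

    def write(self, seq, payload):
        self.index[seq] = IndexEntry(len(self.data), len(payload))
        self.data += payload

    def modify(self, seq, payload):
        # Append the new data at the end and repoint the index; old bytes become garbage.
        entry = self._live(seq)
        entry.start, entry.length = len(self.data), len(payload)
        self.data += payload

    def read(self, seq):
        entry = self._live(seq)
        return bytes(self.data[entry.start:entry.start + entry.length])

    def delete(self, seq):
        # Only the state in the index changes; the data stays until cleanup.
        self._live(seq).state = DELETED

    def valid_ratio(self):
        if not self.data:
            return 1.0
        live = sum(e.length for e in self.index.values() if e.state == VALID)
        return live / len(self.data)

    def compact(self):
        # Copy only the valid data into a fresh block (stand-in for the new aggregate file).
        fresh = FileBlock()
        for seq, entry in self.index.items():
            if entry.state == VALID:
                fresh.write(seq, bytes(self.data[entry.start:entry.start + entry.length]))
        return fresh
```

Modification and deletion never rewrite existing bytes, so both are cheap; the cost is deferred to cleanup, which is why a valid-data ratio is a natural trigger for it.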
In this embodiment, small files are stored in the file blocks of an aggregate file, and the index of each small file is stored in the same file block, so no external index is needed. Because the index and the data of a small file are fused in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, carry over fully, and the scheme is simple and convenient to use. With an external index, by contrast, the index file and the data file of a small file are stored separately, so an additional external program must be designed to ensure properties such as the consistency of the index file and the data file.
In addition, a small file can be uniquely identified by the identifier of the aggregate file together with the sequence number of the small file within the aggregate file. When the small file is subsequently modified, deleted, locked, unlocked, or accessed, the file block holding it can be obtained directly from this identifier, and the operation is then carried out on that file block.
Embodiment eight
Figure 11 shows a structural block diagram of the small-file access system provided by Embodiment eight of the present invention; for ease of description, only the parts relevant to this embodiment are shown. The small-file access system 11 includes a client 111, the control node 112 described in Embodiment six, and the storage node 113 described in Embodiment seven. One control node is connected to at least one storage node, and the client is connected to the control node and to the storage node respectively. The client sends file operation commands to the control node; these include the create-small-file command and may also include the modify-small-file command, the access-small-file command, the delete-small-file command, and the garbage collection and file block cleanup command. The control node stores the attribute information of the aggregate files. After receiving a command from the client, the control node looks up, according to the command, the information of the storage node that stores the small file and sends it to the client. The client then sends the command to the corresponding storage node according to that information, and the storage node performs the corresponding operation on the file block it stores.
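The two-step interaction among client, control node, and storage node can be sketched with toy classes. Everything here is an assumption made for illustration: the class and method names are hypothetical, a dictionary stands in for the file blocks, and round-robin placement replaces the lookup through the aggregate file's storage block list that the control node actually performs.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    name: str
    store: dict = field(default_factory=dict)  # small-file id -> bytes (stand-in for file blocks)

    def handle(self, command, file_id, payload=b""):
        if command == "write":
            self.store[file_id] = payload
            return None
        if command == "read":
            return self.store[file_id]
        raise ValueError(f"unsupported command: {command}")


@dataclass
class ControlNode:
    node_names: list
    placement: dict = field(default_factory=dict)  # small-file id -> storage node name
    next_seq: int = 0

    def create(self, aggregate_id):
        """Assign an identifier and a storage node; no file data passes through here."""
        file_id = f"{aggregate_id}:{self.next_seq}"
        node = self.node_names[self.next_seq % len(self.node_names)]  # assumed placement rule
        self.next_seq += 1
        self.placement[file_id] = node
        return file_id, node

    def locate(self, file_id):
        return self.placement[file_id]


class Client:
    def __init__(self, control, nodes):
        self.control = control
        self.nodes = {n.name: n for n in nodes}

    def put(self, aggregate_id, payload):
        file_id, node = self.control.create(aggregate_id)   # step 1: ask the control node
        self.nodes[node].handle("write", file_id, payload)  # step 2: send data to the storage node
        return file_id

    def get(self, file_id):
        node = self.control.locate(file_id)                 # step 1: ask the control node
        return self.nodes[node].handle("read", file_id)     # step 2: read from the storage node
```

The point of the split is that the control node handles only metadata, while file data travels directly between the client and the storage node.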
In this embodiment, small files are stored in the file blocks of an aggregate file, and the index of each small file is stored in the same file block, so no external index is needed. Because the index and the data of a small file are fused in one file block of the aggregate file, properties provided by the cloud storage system, such as data consistency and fault tolerance, carry over fully, and the scheme is simple and convenient to use. With an external index, by contrast, the index file and the data file of a small file are stored separately, so an additional external program must be designed to ensure properties such as the consistency of the index file and the data file.
In addition, a small file can be uniquely identified by the identifier of the aggregate file together with the sequence number of the small file within the aggregate file. When the small file is subsequently modified, deleted, locked, unlocked, or accessed, the file block holding it can be obtained directly from this identifier, and the operation is then carried out on that file block.
The small-file access system provided by this embodiment of the present invention can be applied in the corresponding method embodiments described above; for details, see the description of those method embodiments, which is not repeated here.
It is worth noting that the units included in the above system embodiments are divided only according to functional logic; the division is not limited to the above, as long as the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the protection scope of the present invention.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (11)

1. A method for accessing small files, characterized in that the method comprises:
A control node receives a create-small-file command sent by a client;
The control node, according to the create-small-file command, creates a new aggregate file or selects an existing aggregate file from an aggregate file list, and generates the sequence number of the small file within the aggregate file according to the attribute information of the aggregate file; the aggregate file list includes the file identifier (id) of each aggregate file in the storage nodes;
The control node generates, according to the attribute information of the aggregate file and the sequence number of the small file within the aggregate file, the identifier of the small file and the information of the storage node used to store the small file, and sends them to the client; the client sends a write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file; and the storage node, according to the write-small-file instruction, writes the data of the small file into the corresponding file block and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block;
The attribute information of the aggregate file includes the identifier of the aggregate file and the storage block list of the aggregate file; the storage block list includes the sequence number of each file block in the aggregate file, the number of small files stored in each file block, and the information of the storage node storing each file block;
Generating the identifier of the small file and the information of the storage node used to store the small file comprises:
The control node generates the identifier of the small file from the identifier of the aggregate file and the sequence number of the small file within the aggregate file;
The control node calculates the sequence number of the file block used to store the small file from the sequence number of the small file within the aggregate file and the number of small files stored in each file block;
The control node obtains the information of the storage node used to store the small file from the sequence number of the file block used to store the small file and the information of the storage node storing each file block;
The file block comprises two parts: index segment data and file data;
Wherein the index segment data includes a Head and an index region, the index region stores the index corresponding to each small file in the file block, and each small file corresponds to exactly one index.
2. The method according to claim 1, characterized in that the storage node writing the data of the small file into the corresponding file block according to the write-small-file instruction, and writing the start position of the data of the small file within the file block and the length of the small file into the index region of the file block, comprises:
The storage node extracts the identifier of the small file from the write-small-file instruction;
The storage node obtains, from the identifier of the small file, the identifier of the aggregate file that stores the small file and the sequence number of the small file within the aggregate file;
The storage node calculates the sequence number of the file block used to store the small file from the sequence number of the small file within the aggregate file and the number of small files stored in each file block;
The storage node writes the data of the small file into the file block, and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block.
3. The method according to any one of claims 1 to 2, characterized in that, after the control node generates the identifier of the small file and the information of the storage node used to store the small file according to the attribute information of the aggregate file and the sequence number of the small file within the aggregate file and sends them to the client, the client sends the write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file, and the storage node writes the data of the small file into the corresponding file block according to the write-small-file instruction and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block, the method further comprises:
The control node receives a modify-small-file instruction sent by the client;
The control node obtains, according to the modify-small-file instruction, the information of the storage node used to store the small file and sends it to the client; the client sends the modify-small-file instruction to the corresponding storage node according to that information; and the storage node, according to the modify-small-file instruction, inserts the modified data of the small file at the end of the file block that stores the small file and at the same time, in the index region of the file block, updates the start position of the data of the small file within the file block to the position at which the modified data is stored and updates the length of the small file to the length of the modified data.
4. The method according to any one of claims 1 to 2, characterized in that, after the control node generates the identifier of the small file and the information of the storage node used to store the small file according to the attribute information of the aggregate file and the sequence number of the small file within the aggregate file and sends them to the client, the client sends the write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file, and the storage node writes the data of the small file into the corresponding file block according to the write-small-file instruction and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block, the method further comprises:
The control node receives a read-small-file instruction sent by the client;
The control node obtains, according to the read-small-file instruction, the information of the storage node used to store the small file and sends it to the client; the client sends the read-small-file instruction to the corresponding storage node according to that information; and the storage node, according to the read-small-file instruction, first finds the index of the small file stored in the file block, then reads the start position and length of the small file recorded in the index, and finally reads the corresponding data according to the start position and length.
5. The method according to any one of claims 1 to 2, characterized in that, after the control node generates the identifier of the small file and the information of the storage node used to store the small file according to the attribute information of the aggregate file and the sequence number of the small file within the aggregate file and sends them to the client, the client sends the write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file, and the storage node writes the data of the small file into the corresponding file block according to the write-small-file instruction and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block, the method further comprises:
The control node receives a delete-small-file instruction sent by the client;
The control node obtains, according to the delete-small-file instruction, the information of the storage node used to store the small file and sends it to the client; the client sends the delete-small-file instruction to the corresponding storage node according to that information; and the storage node, according to the delete-small-file instruction, first finds the index of the small file stored in the file block and changes the state of the small file in the index to deleted.
6. The method according to claim 5, characterized in that, after the control node obtains the information of the storage node used to store the small file according to the delete-small-file instruction and sends it to the client, the client sends the delete-small-file instruction to the corresponding storage node according to that information, and the storage node, according to the delete-small-file instruction, first finds the index of the small file stored in the file block and changes the state of the small file in the index to deleted, the method further comprises:
The control node counts the proportion of valid data in all file blocks of the aggregate file and, for each file block whose proportion of valid data is below a given ratio, reads the information of the storage node storing that file block;
The control node sends a garbage collection and file block cleanup command to the corresponding storage node, and the corresponding storage node merges all valid data according to the command and generates a new aggregate file.
7. A control node, characterized in that the control node comprises:
A create-file instruction receiving unit, configured to receive a create-small-file command sent by a client;
A small-file sequence number generating unit, configured to create a new aggregate file or select an existing aggregate file from an aggregate file list according to the create-small-file command, and to generate the sequence number of the small file within the aggregate file according to the attribute information of the aggregate file; the aggregate file list includes the file identifier (id) of each aggregate file in the storage nodes;
An identifier and storage node information generating unit, configured to generate, according to the attribute information of the aggregate file and the sequence number of the small file within the aggregate file, the identifier of the small file and the information of the storage node used to store the small file, and to send them to the client, whereupon the client sends a write-small-file instruction to the corresponding storage node according to the information of the storage node used to store the small file, and the storage node, according to the write-small-file instruction, writes the data of the small file into the corresponding file block and writes the start position of the data of the small file within the file block and the length of the small file into the index region of the file block;
The attribute information of the aggregate file includes the identifier of the aggregate file and the storage block list of the aggregate file; the storage block list includes the sequence number of each file block in the aggregate file, the number of small files stored in each file block, and the information of the storage node storing each file block;
The identifier and storage node information generating unit includes:
A small-file identifier generating module, configured to generate the identifier of the small file from the identifier of the aggregate file and the sequence number of the small file within the aggregate file;
A file block sequence number generating module, configured to calculate the sequence number of the file block used to store the small file from the sequence number of the small file within the aggregate file and the number of small files stored in each file block;
A storage node information obtaining module, configured to obtain the information of the storage node used to store the small file from the sequence number of the file block used to store the small file and the information of the storage node storing each file block;
The file block comprises two parts: index segment data and file data;
Wherein the index segment data includes a Head and an index region, the index region stores the index corresponding to each small file in the file block, and each small file corresponds to exactly one index.
8. The control node according to claim 7, characterized in that the control node further comprises:
A modify-small-file instruction receiving unit, configured to receive a modify-small-file instruction sent by the client;
A small-file modification unit, configured to obtain, according to the modify-small-file instruction, the information of the storage node used to store the small file and send it to the client, whereupon the client sends the modify-small-file instruction to the corresponding storage node according to that information, and the storage node, according to the modify-small-file instruction, inserts the modified data of the small file at the end of the file block that stores the small file and at the same time, in the index region of the file block, updates the start position of the data of the small file within the file block to the position at which the modified data is stored and updates the length of the small file to the length of the modified data;
A read-small-file instruction receiving unit, configured to receive a read-small-file instruction sent by the client;
A small-file reading unit, configured to obtain, according to the read-small-file instruction, the information of the storage node used to store the small file and send it to the client, whereupon the client sends the read-small-file instruction to the corresponding storage node according to that information, and the storage node, according to the read-small-file instruction, first finds the index of the small file stored in the file block, then reads the start position and length of the small file recorded in the index, and finally reads the corresponding data according to the start position and length;
A delete-small-file instruction receiving unit, configured to receive a delete-small-file instruction sent by the client;
A small-file deletion unit, configured to obtain, according to the delete-small-file instruction, the information of the storage node used to store the small file and send it to the client, whereupon the client sends the delete-small-file instruction to the corresponding storage node according to that information, and the storage node, according to the delete-small-file instruction, first finds the index of the small file stored in the file block and changes the state of the small file in the index to deleted;
A storage node information reading unit, configured to count the proportion of valid data in all file blocks of the aggregate file and, for each file block whose proportion of valid data is below a given ratio, to read the information of the storage node storing that file block;
A new aggregate file generating unit, configured to send a garbage collection and file block cleanup command to the corresponding storage node, whereupon the corresponding storage node merges all valid data according to the command and generates a new aggregate file.
9. A storage node, characterized in that the storage node comprises:
A write-small-file instruction receiving unit, configured to receive a write-small-file instruction sent by a client;
A data writing unit, configured to write the data of the small file into the corresponding file block according to the write-small-file instruction, and to write the start position of the data of the small file within the file block and the length of the small file into the index region of the file block;
The data writing unit includes:
A small-file identifier obtaining module, configured to extract the identifier of the small file from the write-small-file instruction;
A small-file sequence number obtaining module, configured to obtain, from the identifier of the small file, the identifier of the aggregate file that stores the small file and the sequence number of the small file within the aggregate file;
A file block sequence number obtaining module, configured to calculate the sequence number of the file block used to store the small file from the sequence number of the small file within the aggregate file and the number of small files stored in each file block;
A data writing module, configured to write the data of the small file into the file block, and to write the start position of the data of the small file within the file block and the length of the small file into the index region of the file block;
The file block comprises two parts: index segment data and file data; wherein the index segment data includes a Head and an index region, the index region stores the index corresponding to each small file in the file block, and each small file corresponds to exactly one index.
10. The storage node according to claim 9, characterized in that the storage node further comprises:
A modify-small-file instruction receiving unit, configured to receive a modify-small-file instruction sent by the client;
A file modification unit, configured to insert, according to the modify-small-file instruction, the modified data of the small file at the end of the file block that stores the small file, and at the same time, in the index region of the file block, to update the start position of the data of the small file within the file block to the position at which the modified data is stored and to update the length of the small file to the length of the modified data;
A read-small-file instruction receiving unit, configured to receive a read-small-file instruction sent by the client;
A small-file reading unit, configured to first find, according to the read-small-file instruction, the index of the small file stored in the file block, then read the start position and length of the small file recorded in the index, and finally read the corresponding data according to the start position and length;
A delete-small-file instruction receiving unit, configured to receive a delete-small-file instruction sent by the client;
A small-file deletion unit, configured to first find, according to the delete-small-file instruction, the index of the small file stored in the file block, and to change the state of the small file in the index to deleted;
A garbage collection and file block cleanup command receiving unit, configured to receive the garbage collection and file block cleanup command sent by the control node;
A new aggregate file generating unit, configured to merge all valid data according to the command and generate a new aggregate file.
11. A system for accessing small files, characterized in that the system comprises a client, the control node according to any one of claims 7 to 8, and the storage node according to any one of claims 9 to 10, wherein the control node is connected to at least one storage node, and the client is connected to the control node and to the storage node respectively.
CN201310575327.7A 2013-11-15 2013-11-15 A kind of access method of small documents, system and control node and memory node Active CN103605726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310575327.7A CN103605726B (en) 2013-11-15 2013-11-15 A kind of access method of small documents, system and control node and memory node

Publications (2)

Publication Number Publication Date
CN103605726A CN103605726A (en) 2014-02-26
CN103605726B true CN103605726B (en) 2017-11-14

Family

ID=50123948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310575327.7A Active CN103605726B (en) 2013-11-15 2013-11-15 A kind of access method of small documents, system and control node and memory node

Country Status (1)

Country Link
CN (1) CN103605726B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101854388A (en) * 2010-05-17 2010-10-06 浪潮(北京)电子信息产业有限公司 Method and system for concurrently accessing a large number of small files in cluster storage
CN102332029A (en) * 2011-10-15 2012-01-25 西安交通大学 Hadoop-based mass classifiable small file association storage method
CN102662992A (en) * 2012-03-14 2012-09-12 北京搜狐新媒体信息技术有限公司 Method and device for storing and accessing massive small files

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8019790B2 (en) * 2006-07-11 2011-09-13 Dell Products, Lp System and method of dynamically changing file representations
CN102708197B (en) * 2012-05-16 2016-09-21 Tcl集团股份有限公司 Multimedia file management method and device

Also Published As

Publication number Publication date
CN103605726A (en) 2014-02-26

Similar Documents

Publication Publication Date Title
CN103605726B (en) Small file access method and system, control node, and storage node
JP6479020B2 (en) Hierarchical chunking of objects in a distributed storage system
CN110096891B (en) Object signatures in object libraries
US8849759B2 (en) Unified local storage supporting file and cloud object access
JP5822452B2 (en) Storage service providing apparatus, system, service providing method, and service providing program
CN103179185B (en) Method and system for creating files in cache of distributed file system client
US11287994B2 (en) Native key-value storage enabled distributed storage system
CN109522283B (en) Method and system for deleting repeated data
US20130339406A1 (en) System and method for managing filesystem objects
CN106446001B (en) A kind of method and system of the storage file in computer storage medium
CN106105161A (en) To cloud data storage device Backup Data while maintaining storage efficiency
US20180089033A1 (en) Performing data backups using snapshots
CN110647497A (en) HDFS-based high-performance file storage and management system
KR20100070895A (en) Metadata server and metadata management method
JP2014503086A (en) File system and data processing method
CN104184812B (en) A kind of multipoint data transmission method based on private clound
US9749132B1 (en) System and method for secure deletion of data
CN110083552A (en) The reduction redundancy of storing data
CN105100146A (en) Data storage method, device and system
CN106775446A (en) Based on the distributed file system small documents access method that solid state hard disc accelerates
JP2011076294A (en) Method and system for transferring duplicate file in hierarchical storage management system
CN103002027A (en) System and method for data storage on basis of key-value pair system tree-shaped directory achieving structure
US20130332418A1 (en) Method of managing data in asymmetric cluster file system
CN106951375A (en) The method and device of snapped volume is deleted within the storage system
CN105824723B (en) The method and system that a kind of data to publicly-owned cloud storage account are backed up

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201123

Address after: 554300 b1006, 10th floor, building B, main building, Wanshan District, Tongren City, Guizhou Province

Patentee after: Zhicheng Jianke Design Co., Ltd

Address before: Raycom Information Center 2 Beijing City No. 100190 Haidian District road block C academy north building 17 layer 12-13

Patentee before: CHINA SECURITY & FIRE TECHNOLOGY Co.,Ltd.
