CN108984686A - Distributed file system indexing method and device based on log merging - Google Patents
- Publication number
- CN108984686A CN201810718623.0A
- Authority
- CN
- China
- Prior art keywords
- file
- index
- log
- type
- delete
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a distributed file system indexing method and device based on log merging. When logs are merged, the metadata server of the distributed file system constructs file operation metadata information and writes it to a storage unit; the operation metadata information in the storage unit is then read and parsed; finally, the file index operations are executed to build the corresponding index, and the processed objects are deleted. The invention addresses index omissions, poor client compatibility, the inability to build file indexes incrementally, and time-consuming, inefficient index construction.
Description
Technical field
The invention belongs to the technical field of file storage and processing, and in particular relates to a distributed file system indexing method and device based on log merging.
Background art
With the rapid development of the Internet, cloud computing, big data, and artificial intelligence, the market research firm IDC predicts that the global data volume will grow at roughly 50% per year and reach 40 ZB by 2020 (1 ZB = 1 billion TB). Only about 15% of this data is accessed frequently; most data gradually turns cold after it is created. Although the access rate of such "cold data" is very low, it still has to be retained, and enterprises likewise have large volumes of data that must be stored and retrieved.
A journaling file system (Journaling File System) is a file system with failure recovery capability. It uses a log to record modifications that have not yet been committed to the file system, which prevents metadata from being corrupted. Compared with a non-journaling file system, it greatly improves file system stability, increases reliability in the event of a system crash or power failure, shortens recovery time, and guarantees the atomicity of file operations.
At present, a retrieval index for files is built either on the client or on the server. Building the index on the client requires accounting for the many types of clients, so compatibility is poor. Building the index on the server mainly uses the following methods:
Monitoring operations on the files under the file system's mounted folder, and building the corresponding index;
Traversing the files under a specified mounted folder, and building the file index.
The problem with these methods is as follows. In the first method, some operations, such as the move operation (mv), cannot be monitored, so the indexes of some files are omitted. The second method has to traverse all subfolders and files under the folder and cannot build the index incrementally; when the number of files is very large, the traversal takes a long time and is inefficient.
In addition, the log of a journaling file system records every file-related operation. If the index is built on a file system without a log merging function, every operation has to be parsed, which also makes index building inefficient.
Summary of the invention
The purpose of the present invention is to provide a distributed file system indexing method and device based on log merging, so as to solve the problems of index omissions, poor client compatibility, the inability to build file indexes incrementally, and time-consuming, inefficient index construction.
To achieve the above purpose, the technical solution adopted by the present invention is as follows:
A distributed file system indexing method based on log merging, comprising:
Step 1: when a file operation occurs, recording file operation information and writing it to the log, the file operation information including the type of the file operation and the time at which it occurred; and, when the type of the file operation is a move operation, constructing file operation metadata information immediately after the file operation information has been recorded and written to the log;
Step 2: executing a log merge operation when a trigger condition is met;
Step 3: for the files modified within the log merge operation, when the type of the file operation that occurred is a create/delete operation, constructing file operation metadata information and writing it to the information storage unit; when the type of the file operation that occurred is a move operation, writing the already constructed file operation metadata information to the information storage unit;
Step 4: reading the file operation metadata information in the information storage unit;
Step 5: parsing the file operation metadata information that was read and, according to the file operation type obtained from parsing, executing the corresponding file index operation;
Step 6: after all file index operations have been executed, deleting the processed objects in the information storage unit.
Further, recording the file operation information comprises:
Adding fields to the directory entry structure of the operated file to record, respectively, the type of the file operation, the time at which the file operation occurred, and, for a delete operation, the index node of the deleted file;
Adding a field to the directory structure of the operated file to record the name of the deleted file.
Further, executing the log merge operation when the trigger condition is met comprises:
When the number of logs exceeds a set threshold or a log merge command is received, performing the log merge operation with the file's directory as the unit.
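The two trigger conditions can be illustrated with a minimal Python sketch. The class name `MergeTrigger`, its fields, and the threshold value are hypothetical and chosen only for illustration; the patent does not specify how the log count or the merge command is represented.

```python
from dataclasses import dataclass


@dataclass
class MergeTrigger:
    """Decides when a log merge should run (illustrative names only)."""
    threshold: int                  # hypothetical limit on accumulated logs
    command_pending: bool = False   # set when a merge command is received

    def request_merge(self) -> None:
        # Manual trigger: a log merge command has been received.
        self.command_pending = True

    def should_merge(self, log_count: int) -> bool:
        # Automatic trigger: the number of logs exceeds the set threshold.
        return log_count > self.threshold or self.command_pending


trigger = MergeTrigger(threshold=3)
auto = trigger.should_merge(log_count=4)    # threshold exceeded
idle = trigger.should_merge(log_count=2)    # neither condition holds
trigger.request_merge()
manual = trigger.should_merge(log_count=0)  # command received
```

Either condition alone is sufficient, so the check is a plain logical OR of the two.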
Further, constructing the file operation metadata information comprises:
When the type of the file operation is a create operation, obtaining the type of the file operation, the time at which the file operation occurred, the file name, the file size, the file path, the file modification time, whether the same-path file index needs to be deleted, and whether the object is a file, and constructing an operation information string;
When the type of the file operation is a delete operation, obtaining the type of the file operation, the time at which the file operation occurred, the file path, and whether the object is a file, and constructing an operation information string;
When the type of the file operation is a move operation, obtaining the source path, destination path, file size, file name, and file modification time of the file operation and whether the object is a file, and constructing an operation information string.
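As a rough illustration of the three operation information strings, the following Python sketch serializes the listed attributes as JSON. The JSON encoding, the key names, and the function names are assumptions made for this example; the patent does not define the string format.

```python
import json


def build_create_info(etime, name, size, path, mtime, delete_same_path, is_file):
    """Create operation: the attributes listed for creation, as one string."""
    return json.dumps({
        "op": "create", "etime": etime, "name": name, "size": size,
        "path": path, "mtime": mtime,
        "delete_same_path": delete_same_path, "is_file": is_file,
    }, sort_keys=True)


def build_delete_info(etime, path, is_file):
    """Delete operation: type, time, path, and the file flag."""
    return json.dumps({"op": "unlink", "etime": etime, "path": path,
                       "is_file": is_file}, sort_keys=True)


def build_move_info(src, dst, size, name, mtime, is_file):
    # "op" is added so a parser can tell the record types apart
    # (assumption; the text lists only the six attributes passed in).
    return json.dumps({"op": "move", "src": src, "dst": dst, "size": size,
                       "name": name, "mtime": mtime, "is_file": is_file},
                      sort_keys=True)


created = build_create_info(100.0, "a.txt", 12, "/docs/a.txt", 99.0, False, True)
deleted = build_delete_info(101.0, "/docs/old.txt", True)
moved = build_move_info("/docs/a.txt", "/archive/a.txt", 12, "a.txt", 99.0, True)
```

A self-describing format such as JSON keeps the later parsing step simple, but any encoding that preserves the listed attributes would serve the same purpose.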
Further, parsing the file operation metadata information that was read and, according to the file operation type obtained from parsing, executing the corresponding file index operation comprises:
When the type of the file operation is a create operation, first determining whether the same-path file index needs to be deleted; if so, first deleting the file index with the same path from the index set, otherwise doing nothing; then determining whether the object is a file; if it is a file, constructing and executing a file index create operation; if it is not a file, ending;
When the type of the file operation is a delete operation, first determining whether the deleted object is a file; if it is a file, constructing and executing a file index delete operation; if it is a folder, constructing and executing a delete operation on the file indexes under that folder;
When the type of the file operation is a move operation, first determining whether the moved object is a file; if it is a file, first constructing and executing a file index delete operation under the source path, then constructing and executing a file index create operation under the destination path; if it is a folder, first retrieving the file indexes under the folder's path, taking the values of the file path, modification time, file size, and file name from each source file index, then updating the file path from the source path to the destination path while keeping the file name, modification time, and file size unchanged, constructing and executing a file index create operation under the destination path, and then constructing and executing a file index delete operation under the source path.
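The create and delete branches can be sketched in Python against a plain dictionary that stands in for the index set. The dictionary index, the record keys, and the function names are illustrative assumptions; a real deployment would issue the equivalent requests to its search index.

```python
from typing import Dict

Index = Dict[str, Dict[str, object]]  # file path -> indexed attributes


def apply_create(index: Index, info: Dict[str, object]) -> None:
    path = str(info["path"])
    if info["delete_same_path"]:
        index.pop(path, None)          # drop the stale same-path entry first
    if info["is_file"]:                # folders themselves are not indexed
        index[path] = {"name": info["name"], "size": info["size"],
                       "mtime": info["mtime"]}


def apply_delete(index: Index, info: Dict[str, object]) -> None:
    path = str(info["path"])
    if info["is_file"]:
        index.pop(path, None)
        return
    prefix = path.rstrip("/") + "/"    # folder: delete every entry beneath it
    for child in [p for p in index if p.startswith(prefix)]:
        del index[child]


index: Index = {
    "/d/a.txt": {"name": "a.txt", "size": 1, "mtime": 1.0},
    "/d/sub/b.txt": {"name": "b.txt", "size": 2, "mtime": 2.0},
    "/dx/c.txt": {"name": "c.txt", "size": 3, "mtime": 3.0},
}
apply_delete(index, {"path": "/d", "is_file": False})
apply_create(index, {"path": "/dx/c.txt", "name": "c.txt", "size": 9,
                     "mtime": 4.0, "delete_same_path": True, "is_file": True})
```

Appending the trailing slash before the prefix match keeps a sibling such as `/dx` from being treated as part of the folder `/d`.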
The present invention also provides a distributed file system indexing device based on log merging, which includes an update-build-write unit module, an information storage unit module, and a parse-build-execute unit module, wherein:
The update-build-write unit module is used to record file operation information when a file operation occurs and write it to the log, the file operation information including the type of the file operation and the time at which it occurred, and, when the type of the file operation is a move operation, to construct file operation metadata information immediately after the file operation information has been recorded and written to the log; to execute a log merge operation when a trigger condition is met; and, for the files modified within the log merge operation, to construct file operation metadata information and write it to the information storage unit module when the type of the file operation that occurred is a create/delete operation, and to write the already constructed file operation metadata information to the information storage unit module when the type of the file operation that occurred is a move operation;
The information storage unit module is used to store the file operation metadata information constructed in the update-build-write unit module;
The parse-build-execute unit module is used to read the file operation metadata information in the information storage unit module; to parse the file operation metadata information that was read and, according to the file operation type obtained from parsing, execute the corresponding file index operation; and, after all file index operations have been executed, to delete the processed objects in the information storage unit module.
Further, to record the file operation information when a file operation occurs, the following operations are performed:
The update-build-write unit module adds fields to the directory entry structure of the operated file to record the type of the file operation, the time at which the file operation occurred, and, for a delete operation, the index node of the deleted file; and adds a field to the directory structure of the operated file to record the name of the operated file.
Further, to execute the log merge operation when the trigger condition is met, the following operations are performed:
When the number of logs exceeds a set threshold or a log merge command is received, the update-build-write unit module performs the log merge operation with the file's directory as the unit.
Further, to construct the file operation metadata information, the following operations are performed:
When the type of the file operation is a create operation, the update-build-write unit module obtains the type of the file operation, the time at which the file operation occurred, the file name, the file size, the file path, the file modification time, whether the same-path file index needs to be deleted, and whether the object is a file, and constructs an operation information string;
When the type of the file operation is a delete operation, the update-build-write unit module obtains the type of the file operation, the time at which the file operation occurred, the file path, and whether the object is a file, and constructs an operation information string;
When the type of the file operation is a move operation, the update-build-write unit module obtains the source path, destination path, file size, file name, and file modification time of the file operation and whether the object is a file, and constructs an operation information string.
Further, to parse the file operation metadata information that was read and, according to the file operation type obtained from parsing, execute the corresponding file index operation, the following operations are performed:
When the type of the file operation is a create operation, the parse-build-execute unit module first determines whether the same-path file index needs to be deleted; if so, it first deletes the file index with the same path from the index set, otherwise it does nothing; it then determines whether the object is a file; if it is a file, it constructs and executes a file index create operation; if it is not a file, it returns;
When the type of the file operation is a delete operation, the parse-build-execute unit module first determines whether the deleted object is a file; if it is a file, it constructs and executes a file index delete operation; if it is a folder, it constructs and executes a delete operation on the file indexes under that folder;
When the type of the file operation is a move operation, the parse-build-execute unit module first determines whether the moved object is a file; if it is a file, it first constructs and executes a file index delete operation under the source path, then constructs and executes a file index create operation under the destination path; if it is a folder, it first retrieves the file indexes under the folder's path, takes the values of the file path, modification time, file size, and file name from each source file index, then updates the file path from the source path to the destination path while keeping the file name, modification time, and file size unchanged, constructs and executes a file index create operation under the destination path, and then constructs and executes a file index delete operation under the source path.
In the distributed file system indexing method and device based on log merging provided by the present invention, the metadata server of the distributed file system constructs file operation metadata information when logs are merged and writes it to a storage unit; the operation metadata information in the storage unit is then read and parsed; finally, the file index operations are executed to build the corresponding index, and the processed objects are deleted. Because the operation information is obtained on the metadata server, no file operation is lost and the type of client does not need to be considered, which solves the problems of index omissions and poor client compatibility. In addition, log merging makes it possible to obtain the file operation information for a period of time incrementally and reduces the number of index builds, which solves the problems that file indexes cannot be built incrementally and that building them is time-consuming and inefficient.
Detailed description of the invention
Fig. 1 is a flow diagram of an embodiment of the distributed file system indexing method based on log merging of the present invention;
Fig. 2 is a flow diagram of recording file operation information during a create operation in an embodiment of the method;
Fig. 3 is a flow diagram of recording file operation information during a delete operation in an embodiment of the method;
Fig. 4 is a flow diagram of recording file operation information during a move operation in an embodiment of the method;
Fig. 5 is a flow diagram of constructing file operation metadata information during a create operation in an embodiment of the method;
Fig. 6 is a flow diagram of constructing file operation metadata information during a delete operation in an embodiment of the method;
Fig. 7 is a flow diagram of constructing file operation metadata information during a move operation in an embodiment of the method;
Fig. 8 is a flow diagram of the file index operation executed for a create operation in an embodiment of the method;
Fig. 9 is a flow diagram of the file index operation executed for a delete operation in an embodiment of the method;
Fig. 10 is a flow diagram of the file index operation executed for a move operation in an embodiment of the method;
Fig. 11 is a structural diagram of an embodiment of the distributed file system indexing device based on log merging of the present invention;
Fig. 12 is a functional diagram of an embodiment of the update-build-write unit module of the distributed file system indexing device based on log merging of the present invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments; the following embodiments do not limit the present invention.
This embodiment provides a distributed file system indexing method based on log merging. As shown in Fig. 1, the method comprises:
S101: when a file operation occurs, record the file operation information and write it to the log. The file operation information includes the type of the file operation and the time at which it occurred; when the type of the file operation is a move operation, the file operation metadata information is constructed immediately after the file operation information has been recorded and written to the log.
Unless otherwise specified, every operation object in this embodiment may be a file or a folder; for example, "file operation" does not refer only to operations on files but also covers operations on folders. The types of file operation are the create operation, the delete operation, and the move operation, where the create operation covers both creation and modification, and the move operation covers both moving and renaming.
The file operation information is recorded mainly by adding the operation, etime, and preInode fields to the directory entry (dentry) structure of the operated file, which record, respectively, the type of the file operation, the time at which the file operation occurred, and, for a delete operation, the index node of the deleted file; and by adding the deleted_files field to the directory (dir) structure of the operated file, which records the names of deleted files.
Specifically, as shown in Fig. 2, when the type of the file operation that occurs is a create operation, recording the file operation information comprises: adding the operation and etime fields to the directory entry structure of the created object, setting the operation field to create, and setting the etime field to the current time.
As shown in Fig. 3, when the type of the file operation that occurs is a delete operation, recording the file operation information comprises: adding the operation, etime, and preInode fields to the directory entry structure of the deleted object, setting the operation field to unlink (delete), and setting the etime field to the current time; then determining whether the deleted object is a folder. If it is a folder, the preInode field in the directory entry structure is set to 0; if it is a file, the preInode field in the directory entry structure is set to the index node (inode) of the deleted file, and the deleted_files field is added to the directory structure of the deleted object to record the name of the deleted object.
As shown in Fig. 4, when the type of the file operation that occurs is a move operation, the file operation information has to be recorded in both the source directory entry (src dentry) and the destination directory entry (dest dentry). Recording the file operation information comprises: first adding the operation, etime, and preInode fields to the source directory entry structure of the moved object, setting the operation field to move, the etime field to the current time, and the preInode field to the source index node; then adding the operation and etime fields to the destination directory entry structure of the moved object, setting the operation field to move and the etime field to the current time.
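The recording steps for the three operation types can be sketched in Python. The field names operation, etime, preInode (written `pre_inode` here), and deleted_files come from the text; the `Dentry` and `Dir` classes, the function names, and the choice to clear the inode link on delete are simplifying assumptions, and the delete sketch covers only the case of a deleted file.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dentry:
    name: str
    inode: Optional[int]              # linked index node, None once unlinked
    operation: Optional[str] = None   # added field: create / unlink / move
    etime: Optional[float] = None     # added field: time the operation occurred
    pre_inode: Optional[int] = None   # added field: inode before delete or move


@dataclass
class Dir:
    deleted_files: List[str] = field(default_factory=list)  # added field


def record_create(dentry: Dentry, now: Optional[float] = None) -> None:
    dentry.operation = "create"
    dentry.etime = time.time() if now is None else now


def record_unlink_file(parent: Dir, dentry: Dentry,
                       now: Optional[float] = None) -> None:
    dentry.operation = "unlink"
    dentry.etime = time.time() if now is None else now
    dentry.pre_inode = dentry.inode   # remember the deleted file's inode
    dentry.inode = None               # assumption: unlink clears the link
    parent.deleted_files.append(dentry.name)


def record_move(src: Dentry, dst: Dentry, now: Optional[float] = None) -> None:
    stamp = time.time() if now is None else now
    src.operation, src.etime, src.pre_inode = "move", stamp, src.inode
    dst.operation, dst.etime = "move", stamp


docs = Dir()
old = Dentry("a.txt", inode=42)
record_unlink_file(docs, old, now=10.0)
src, dst = Dentry("b.txt", inode=7), Dentry("c.txt", inode=7)
record_move(src, dst, now=11.0)
```

The explicit `now` argument exists only to make the example deterministic; in normal use the current time would be taken at the moment of the operation.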
For a move operation, the file operation metadata information is constructed immediately after the file operation information has been recorded. It contains the following information: the source path of the file operation, the destination path, the file size, the file name, the file modification time, and whether the object is a file.
Specifically, as shown in Fig. 7, for a move operation, constructing the file operation metadata information comprises: first obtaining the source path of the moved object; then obtaining the index node of the destination object according to the source path, and obtaining the destination path from that index node; obtaining the file size, file name, file modification time, and whether the object is a file from the directory entry of the object at the destination path; constructing a move operation information string from the obtained information; and, by adding an index_move structure to the file system cache, recording the constructed move operation information string in the index_move structure for later use.
Because the file operation information is recorded by adding fields to the directory and directory entry structures, this embodiment records it in the cache. The distributed file system itself records a system log, so this information is also written to the log together with the file operation event. After a restart, the metadata server of the distributed file system rebuilds the cache by replaying the log; during replay, the operation metadata information recorded in the file's directory entry is also loaded into the cache, so no file operation information is omitted or lost, which guarantees the integrity of the file index.
S102: execute the log merge operation when the trigger condition is met.
Log merging mainly means that the metadata server (Metadata Server, MDS) merges all operations within a certain period, persists the file information, and deletes the logs that recorded the operation events. When a log merge operation starts, the MDS first reads the dirty data in the cache and adds the related directories to a list. It then traverses this list and persists the dirty files (or folders) with the directory as the unit. Finally, it deletes the logs corresponding to this merge.
Log merging makes it possible to obtain the file operation information for a period of time incrementally and reduces the number of index builds, which effectively solves the problems that file indexes cannot be built incrementally and that building them is time-consuming and inefficient. A log merge operation merges the logs accumulated between the previous log merge and the current one. The trigger condition of the log merge operation can be automatic or manual: with automatic triggering, the metadata server (Metadata Server, MDS) executes the log merge operation automatically when the number of logs exceeds the set threshold; with manual triggering, the metadata server executes the log merge operation upon receiving a file operation log merge command. In this embodiment, the log merge operation is performed with the file's directory as the unit.
When a log merge operation is performed, the dirty directory fragments (dirfrag), directory entries, index nodes, and other structures in the cache are traversed first, and the directories (dir) they belong to are recorded. The recorded directories are then traversed to obtain the names of the recently modified files recorded in each directory, and file operation metadata information is subsequently constructed for these modified files.
The metadata server determines whether a file was deleted or created from the state of the index node linked to the file's directory entry: if the index node linked to the directory entry is empty, the operation is a delete operation; otherwise it is a create operation.
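The directory grouping and the create/delete decision can be illustrated with a short Python sketch. The input shape, a list of (directory path, entry name, linked inode) tuples, and the function name `collect_modified` are assumptions made for the example; the metadata server's actual cache structures are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def collect_modified(
    dirty: Iterable[Tuple[str, str, Optional[int]]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group dirty entries by directory and classify each one."""
    by_dir = defaultdict(list)
    for dir_path, name, linked_inode in dirty:
        # An empty inode link means the entry was deleted; otherwise created.
        op = "unlink" if linked_inode is None else "create"
        by_dir[dir_path].append((name, op))
    return dict(by_dir)


modified = collect_modified([
    ("/docs", "a.txt", 42),
    ("/docs", "old.txt", None),
    ("/tmp", "x.log", 9),
])
```

Move operations do not pass through this classification, since their metadata was already constructed at the time of the operation.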
S103: for the files modified within the log merge operation, when the type of the file operation that occurred is a create/delete operation, construct file operation metadata information and write it to the information storage unit; when the type of the file operation that occurred is a move operation, write the already constructed file operation metadata information to the information storage unit.
The modified files above are the files that have undergone a create/delete/move operation. Because the log records these modified files, the directories of the files modified within the log merge operation can be obtained during the merge, and the file operation metadata information can be constructed.
In this embodiment, each piece of file operation metadata information is first stored in the cache after it is constructed; once all file operation metadata information has been constructed, it is written to the information storage unit in one pass.
When the type of the file operation is a create operation, the type of the file operation, the time at which the file operation occurred, the file name, the file size, the file path, the file modification time, whether the same-path file index needs to be deleted, and whether the object is a file are obtained, and an operation information string is constructed.
Specifically, as shown in Fig. 5, for a create operation, constructing the file operation metadata information comprises: first obtaining the directory entry of the created file or folder; obtaining the index node of the file or folder from the directory entry; obtaining the type of the file operation, the file modification time, the file size, and whether the object is a file through the index node; then determining whether the type of the file operation is a create operation. If it is a create operation, the time at which the file operation occurred, the file name, and the file path are obtained from the directory entry, and whether the same-path file index needs to be deleted is determined by checking whether the deleted_files field in the directory structure contains the name of the created object: if it does, the delete-same-path-file-index flag is set to true; if it does not, the flag is set to false. Finally, a create operation information string is constructed from the obtained information and, by adding an index_create structure to the file system cache, recorded in the index_create structure for later use. If it is not a create operation, the procedure ends directly.
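The delete-same-path flag is the one derived attribute in the create record, so a small Python sketch may help. The record is kept as a dictionary and index_create as a list; these representations, the key names, and the function name are assumptions for illustration only.

```python
from typing import Dict, List


def build_create_record(dir_path: str, deleted_files: List[str], name: str,
                        etime: float, size: int, mtime: float,
                        is_file: bool) -> Dict[str, object]:
    return {
        "op": "create", "etime": etime, "name": name, "size": size,
        "mtime": mtime, "path": dir_path.rstrip("/") + "/" + name,
        "is_file": is_file,
        # True when an object of the same name was deleted in this directory,
        # so its stale index entry must be removed before the new one is added.
        "delete_same_path": name in deleted_files,
    }


# index_create stands in for the structure added to the file system cache.
index_create: List[Dict[str, object]] = []
index_create.append(
    build_create_record("/docs", ["a.txt"], "a.txt", 5.0, 3, 5.0, True))
index_create.append(
    build_create_record("/docs", ["a.txt"], "b.txt", 6.0, 8, 6.0, True))
```

In the example, a.txt was deleted and then recreated within the same merge window, so only its record carries the flag.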
When the type of the file operation is a delete operation, the type of the file operation, the time at which the file operation occurred, the file path, and whether the object is a file are obtained, and an operation information string is constructed.
Specifically, as shown in Fig. 6, for a delete operation, constructing the file operation metadata information comprises: first obtaining the directory entry of the deleted object; obtaining the type of the file operation from the directory entry; and determining whether the type of the file operation is a delete operation. If it is a delete operation, the file path of the deleted object, the time at which the file operation occurred, and whether the object is a file are obtained from the directory entry; a delete operation information string is constructed from the obtained information and, by adding an index_unlink structure to the file system cache, recorded in the index_unlink structure for later use. If it is not a delete operation, the procedure ends directly.
After all file operation metadata information has been constructed, the file operation metadata information in the index_move, index_create, and index_unlink structures in the file system cache is written to the information storage unit. Because log merging proceeds directory by directory, the file operation metadata information is also written to the information storage unit directory by directory: the file operation metadata information under each directory is written to one object. After the write completes, all index_move, index_create, and index_unlink structures that record file operation metadata information are cleared.
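The per-directory write and the subsequent clearing can be sketched in Python. The three structure names come from the text; representing each as a list of (directory path, information string) pairs and the storage unit as a list of objects is an assumption made so that the example is self-contained.

```python
from typing import Dict, List, Tuple

CACHE_NAMES = ("index_move", "index_create", "index_unlink")


def flush_caches(caches: Dict[str, List[Tuple[str, str]]],
                 store: List[Dict[str, object]]) -> None:
    """Write cached operation strings to the store, one object per directory."""
    per_dir: Dict[str, List[str]] = {}
    for cache_name in CACHE_NAMES:
        for dir_path, info in caches[cache_name]:
            per_dir.setdefault(dir_path, []).append(info)
    for dir_path, infos in per_dir.items():
        store.append({"dir": dir_path, "records": infos})
    for cache_name in CACHE_NAMES:
        caches[cache_name].clear()   # emptied only after the write completes


caches = {
    "index_move": [("/docs", "move:/docs/a->/docs/b")],
    "index_create": [("/docs", "create:/docs/c"), ("/tmp", "create:/tmp/x")],
    "index_unlink": [],
}
store: List[Dict[str, object]] = []
flush_caches(caches, store)
```

Appending a new object on every flush, rather than overwriting by directory name, avoids losing records from an earlier merge that the reader has not yet processed.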
S104: read the file operation metadata information in the information storage unit.
The contents of all objects in the index data pool of the information storage unit are read and sorted by the time at which the file operation occurred, as recorded in the file operation metadata information. Specifically, the information storage unit is queried periodically for objects; if objects exist, the contents of all of them are read, converted to a readable format, and then sorted by the file operation occurrence time in the file operation metadata information.
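A minimal Python sketch of the read-and-sort step follows. It assumes that each stored object holds newline-separated JSON records and that every record carries an `etime` key; both are illustrative choices, and the periodic polling of the storage unit is left out.

```python
import json
from typing import Dict, List


def read_sorted(objects: List[str]) -> List[Dict[str, object]]:
    """Decode every record in every object, then order them by etime."""
    records = []
    for content in objects:
        for line in content.splitlines():
            if line.strip():
                records.append(json.loads(line))
    records.sort(key=lambda record: record["etime"])
    return records


objects = [
    '{"op": "unlink", "etime": 30.0, "path": "/docs/a.txt"}\n'
    '{"op": "create", "etime": 10.0, "path": "/docs/a.txt"}',
    '{"op": "create", "etime": 20.0, "path": "/tmp/x.log"}',
]
ordered = read_sorted(objects)
```

Sorting across all objects matters because records for the same path may sit in different objects, and replaying them out of order would leave the index in the wrong final state.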
S105, the file operation metadata information read is parsed, according to file operation resulting after parsing
Type executes corresponding file index operation.
Specifically, as shown in figure 8, when the type of file operation is creation operation, corresponding file index operation is executed,
It include: first to judge whether to need to delete with path file index, if necessary to delete, then the path first deleted in indexed set is identical
File index, it is on the contrary then without operation;Then file is judged whether it is again, if it is file, is then constructed and is executed file index
Creation operation, if not file, then terminates.
As shown in figure 9, executing corresponding file index operation, comprising: first when the type of file operation is delete operation
Judge to delete whether object is file, if it is file, then constructs and execute and delete file index operation;If it is file,
It then constructs and executes the file index delete operation under this document folder.
As shown in Figure 10, when the type of file operation is a move operation, executing the corresponding file index operation includes: first determining whether the moved object is a file. If it is a file, a file index delete operation under the source path is first constructed and executed, then a file index creation operation under the destination path is constructed and executed, and processing ends. If it is a folder, the file indexes under the folder path are first retrieved, and the values of file path, modification time, file size, and file name are taken from each source file index; the file path is then updated from the source path to the destination path, while the file name, modification time, and file size remain unchanged; a file index creation operation under the destination path is constructed and executed, then a file index delete operation under the source path is constructed and executed, and processing ends.
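The move flow of Figure 10 can be sketched as follows, with the index set modeled as a dict keyed by path. The record fields (`src`, `dst`, `is_file`, `name`, `size`, `mtime`) are hypothetical, and rewriting the path by prefix substitution is an assumption about how "updating the source path to the destination path" could be done.

```python
def apply_move(index, record):
    """Apply one move record to `index`, a dict keyed by file path."""
    src, dst = record["src"], record["dst"]
    if record["is_file"]:
        # File: delete under the source path, then create under the destination.
        index.pop(src, None)
        index[dst] = {
            "name": record["name"],
            "size": record["size"],
            "mtime": record["mtime"],
        }
        return
    # Folder: look up every entry below the source folder, create it under the
    # destination with name, size and mtime unchanged, then delete the old one.
    src_prefix = src.rstrip("/") + "/"
    dst_prefix = dst.rstrip("/") + "/"
    for old_path in [k for k in index if k.startswith(src_prefix)]:
        new_path = dst_prefix + old_path[len(src_prefix):]
        index[new_path] = dict(index[old_path])
        del index[old_path]


index = {
    "/src/a.txt": {"name": "a.txt", "size": 1, "mtime": 5},
    "/src/sub/b.txt": {"name": "b.txt", "size": 2, "mtime": 6},
    "/other/c.txt": {"name": "c.txt", "size": 3, "mtime": 7},
}
apply_move(index, {"src": "/src", "dst": "/dst", "is_file": False})
```

In the folder case only the path changes, so the move can be served entirely from the existing index entries without touching the files themselves.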
In this embodiment, the readable format produced in step S104 corresponds to the data format parsed in step S105; both use the prior art and are not described further here.
S106: After all file index operations have been executed, delete the processed objects from the information storage unit.
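Steps S104 to S106 taken together can be sketched as a single pass. This is an illustration only: the information storage unit is modeled as a dict from object name to a list of records, and `apply_operation` is a hypothetical callback that performs the index operation for one record. That deferring deletion allows a retry after a failure is a reading of the step order, not something the source states.

```python
def run_index_pass(store, apply_operation):
    """Apply every stored record, then delete the processed objects."""
    processed = []
    for name in sorted(store):
        for record in store[name]:
            apply_operation(record)
        processed.append(name)
    # Deletion happens only after all index operations have completed, so an
    # exception raised above leaves every object in place.
    for name in processed:
        del store[name]
    return processed


store = {"obj-1": [{"op_type": "create"}], "obj-2": [{"op_type": "delete"}]}
seen = []
done = run_index_pass(store, seen.append)
```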
This embodiment further includes a distributed file system indexing device based on log merging. The distributed file system is composed of multiple storage arrays with good performance and large capacity, serving, for example, as monitors, metadata servers, and storage units. The client connects to the distributed file system through the network and sends file operation requests to it; after receiving a request, the metadata server carries out the file operation and finally returns the operation result to the client.
As shown in Figure 11, the distributed file system indexing device 10 based on log merging includes an update-build-write unit module 11, an information storage unit module 12, and a parse-build-execute unit module 13, wherein:
The update-build-write unit module 11 is described in further detail with reference to Figure 12. The update-build-write unit module 11 is used to record file operation information and write it to the log when a file operation occurs, the file operation information including the type of the file operation and the occurrence time of the file operation; when the type of the file operation is a move operation, it constructs the file operation metadata information immediately after the file operation information has been recorded and written to the log.
It also executes a log merge operation when a trigger condition is met. For the files modified in the log merge operation, when the type of the file operation that occurred is a creation or delete operation, it constructs the file operation metadata information and writes it into the information storage unit module 12; when the type of the file operation that occurred is a move operation, it writes the already constructed file operation metadata information into the information storage unit module 12.
The information storage unit module 12 is used to store the file operation metadata information constructed by the update-build-write unit module 11.
The parse-build-execute unit module 13 is used to read the file operation metadata information in the information storage unit module 12; to parse the file operation metadata information that has been read and execute the corresponding file index operation according to the type of file operation obtained from parsing; and, after all file index operations have been executed, to delete the processed objects from the information storage unit module 12.
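The differing timing for move versus creation and delete operations in module 11 can be sketched as below. The class and method names are hypothetical, the information storage unit is modeled as a list, the metadata is encoded as JSON, and the log-count threshold is only one of the trigger conditions mentioned in the claims; all of these are assumptions for illustration.

```python
import json


class UpdateBuildWriter:
    """Toy model of module 11: log operations, merge, write metadata out."""

    def __init__(self, store, merge_threshold=3):
        self.store = store  # stands in for information storage unit module 12
        self.log = []
        self.merge_threshold = merge_threshold

    def record(self, op_type, op_time, **details):
        entry = {"op_type": op_type, "op_time": op_time, **details}
        self.log.append(entry)
        if op_type == "move":
            # Move operations get their metadata built right after logging.
            entry["metadata"] = self._build(entry)
        if len(self.log) > self.merge_threshold:
            self.merge()

    def merge(self):
        for entry in self.log:
            # Creation and delete metadata is built here, at merge time;
            # move metadata was already built and is only written out.
            self.store.append(entry.get("metadata") or self._build(entry))
        self.log.clear()

    @staticmethod
    def _build(entry):
        fields = {k: v for k, v in entry.items() if k != "metadata"}
        return json.dumps(fields, sort_keys=True)
```

A merge can also be forced by calling `merge()` directly, which corresponds to receiving an explicit log merge command.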
The distributed file system indexing device based on log merging of this embodiment corresponds to the distributed file system indexing method based on log merging described above; the specific operations of each step are not repeated here.
It should be noted that, in one implementation of the present invention, the update-build-write unit module 11 is arranged in the metadata server of the distributed file system; the information storage unit module 12 is an independent device with an information storage function; and the parse-build-execute unit module 13 is an independent device stored on any one storage array of the distributed file system. The invention may also be implemented in other ways. For example, the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into one system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be realized through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer processor to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Without departing from the spirit and essence of the present invention, those skilled in the art can make various corresponding changes and modifications in accordance with the present invention, but all such changes and modifications shall fall within the scope of protection of the appended claims of the present invention.
Claims (10)
1. A distributed file system indexing method based on log merging, characterized in that the distributed file system indexing method based on log merging comprises:
Step 1: when a file operation occurs, recording file operation information and writing it to a log, the file operation information including the type of the file operation and the occurrence time of the file operation; and, when the type of the file operation is a move operation, constructing file operation metadata information immediately after the file operation information has been recorded and written to the log;
Step 2: executing a log merge operation when a trigger condition is met;
Step 3: for the files modified in the log merge operation, when the type of the file operation that occurred is a creation or delete operation, constructing file operation metadata information and writing it into an information storage unit; when the type of the file operation that occurred is a move operation, writing the already constructed file operation metadata information into the information storage unit;
Step 4: reading the file operation metadata information in the information storage unit;
Step 5: parsing the file operation metadata information that has been read, and executing the corresponding file index operation according to the type of file operation obtained from parsing;
Step 6: after all file index operations have been executed, deleting the processed objects from the information storage unit.
2. The distributed file system indexing method based on log merging according to claim 1, characterized in that recording the file operation information comprises:
recording the type of the file operation, the occurrence time of the file operation, and, for a delete operation, the index node of the deleted file, each in a field added to the directory entry structure of the operated file;
recording the name of the deleted file in a field added to the directory structure of the operated file.
3. The distributed file system indexing method based on log merging according to claim 1, characterized in that executing the log merge operation when the trigger condition is met comprises:
when the number of logs exceeds a set threshold or a log merge command is received, performing the log merge operation with the directory of the file as the unit.
4. The distributed file system indexing method based on log merging according to claim 1, characterized in that constructing the file operation metadata information comprises:
when the type of file operation is a creation operation, obtaining the type of the file operation, the occurrence time of the file operation, the file name, the file size, the file path, the file modification time, whether a file index with the same path needs to be deleted, and whether the object is a file, and constructing an operation information character string;
when the type of file operation is a delete operation, obtaining the type of the file operation, the occurrence time of the file operation, the file path, and whether the object is a file, and constructing an operation information character string;
when the type of file operation is a move operation, obtaining the source path of the file operation, the destination path, the file size, the file name, the file modification time, and whether the object is a file, and constructing an operation information character string.
5. The distributed file system indexing method based on log merging according to claim 4, characterized in that parsing the file operation metadata information that has been read and executing the corresponding file index operation according to the type of file operation obtained from parsing comprises:
when the type of file operation is a creation operation, first determining whether a file index with the same path needs to be deleted; if so, first deleting the file index with the same path from the index set, and otherwise taking no action; then determining whether the object is a file; if it is a file, constructing and executing a file index creation operation, and if it is not a file, ending;
when the type of file operation is a delete operation, first determining whether the deleted object is a file; if it is a file, constructing and executing a file index delete operation; if it is a folder, constructing and executing a delete operation for the file indexes under that folder;
when the type of file operation is a move operation, first determining whether the moved object is a file; if it is a file, first constructing and executing a file index delete operation under the source path, then constructing and executing a file index creation operation under the destination path; if it is a folder, first retrieving the file indexes under the folder path, taking the values of file path, modification time, file size, and file name from each source file index, then updating the file path from the source path to the destination path while the file name, modification time, and file size remain unchanged, constructing and executing a file index creation operation under the destination path, and then constructing and executing a file index delete operation under the source path.
6. A distributed file system indexing device based on log merging, characterized in that the distributed file system indexing device based on log merging comprises an update-build-write unit module, an information storage unit module, and a parse-build-execute unit module, wherein:
the update-build-write unit module is used to record file operation information and write it to a log when a file operation occurs, the file operation information including the type of the file operation and the occurrence time of the file operation, and, when the type of the file operation is a move operation, to construct file operation metadata information immediately after the file operation information has been recorded and written to the log; and to execute a log merge operation when a trigger condition is met; for the files modified in the log merge operation, when the type of the file operation that occurred is a creation or delete operation, to construct file operation metadata information and write it into the information storage unit module; and when the type of the file operation that occurred is a move operation, to write the already constructed file operation metadata information into the information storage unit module;
the information storage unit module is used to store the file operation metadata information constructed by the update-build-write unit module;
the parse-build-execute unit module is used to read the file operation metadata information in the information storage unit module; to parse the file operation metadata information that has been read and execute the corresponding file index operation according to the type of file operation obtained from parsing; and, after all file index operations have been executed, to delete the processed objects from the information storage unit module.
7. The distributed file system indexing device based on log merging according to claim 6, characterized in that, when a file operation occurs, the update-build-write unit module records the file operation information by performing the following operations:
the update-build-write unit module records the type of the file operation, the occurrence time of the file operation, and, for a delete operation, the index node of the deleted file in fields added to the directory entry structure of the operated file; and records the name of the operated file in a field added to the directory structure of the operated file.
8. The distributed file system indexing device based on log merging according to claim 6, characterized in that the update-build-write unit module executes the log merge operation when the trigger condition is met by performing the following operation:
when the number of logs exceeds a set threshold or a log merge command is received, the update-build-write unit module performs the log merge operation with the directory of the file as the unit.
9. The distributed file system indexing device based on log merging according to claim 6, characterized in that the update-build-write unit module constructs the file operation metadata information by performing the following operations:
when the type of file operation is a creation operation, the update-build-write unit module obtains the type of the file operation, the occurrence time of the file operation, the file name, the file size, the file path, the file modification time, whether a file index with the same path needs to be deleted, and whether the object is a file, and constructs an operation information character string;
when the type of file operation is a delete operation, the update-build-write unit module obtains the type of the file operation, the occurrence time of the file operation, the file path, and whether the object is a file, and constructs an operation information character string;
when the type of file operation is a move operation, the update-build-write unit module obtains the source path of the file operation, the destination path, the file size, the file name, the file modification time, and whether the object is a file, and constructs an operation information character string.
10. The distributed file system indexing device based on log merging according to claim 9, characterized in that the parse-build-execute unit module parses the file operation metadata information that has been read and executes the corresponding file index operation according to the type of file operation obtained from parsing by performing the following operations:
when the type of file operation is a creation operation, the parse-build-execute unit module first determines whether a file index with the same path needs to be deleted; if so, it first deletes the file index with the same path from the index set, and otherwise takes no action; it then determines whether the object is a file; if it is a file, it constructs and executes a file index creation operation, and if it is not a file, it returns;
when the type of file operation is a delete operation, the parse-build-execute unit module first determines whether the deleted object is a file; if it is a file, it constructs and executes a file index delete operation; if it is a folder, it constructs and executes a delete operation for the file indexes under that folder;
when the type of file operation is a move operation, the parse-build-execute unit module first determines whether the moved object is a file; if it is a file, it first constructs and executes a file index delete operation under the source path, then constructs and executes a file index creation operation under the destination path; if it is a folder, it first retrieves the file indexes under the folder path, takes the values of file path, modification time, file size, and file name from each source file index, then updates the file path from the source path to the destination path while the file name, modification time, and file size remain unchanged, constructs and executes a file index creation operation under the destination path, and then constructs and executes a file index delete operation under the source path.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810718623.0A CN108984686B (en) | 2018-07-02 | 2018-07-02 | Distributed file system indexing method and device based on log merging |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810718623.0A CN108984686B (en) | 2018-07-02 | 2018-07-02 | Distributed file system indexing method and device based on log merging |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108984686A true CN108984686A (en) | 2018-12-11 |
CN108984686B CN108984686B (en) | 2021-03-30 |
Family
ID=64536753
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810718623.0A Active CN108984686B (en) | 2018-07-02 | 2018-07-02 | Distributed file system indexing method and device based on log merging |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108984686B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110032496A (en) * | 2019-04-19 | 2019-07-19 | 杭州玳数科技有限公司 | A kind of log collection method and system for supporting diversified log merging |
CN111208946A (en) * | 2020-01-06 | 2020-05-29 | 北京同有飞骥科技股份有限公司 | Data persistence method and system supporting KB-level small file concurrent IO |
CN111427989A (en) * | 2019-01-10 | 2020-07-17 | 北大方正集团有限公司 | Index processing method, index processing system and storage medium for full-text retrieval |
CN111984598A (en) * | 2020-08-20 | 2020-11-24 | 重庆紫光华山智安科技有限公司 | High-performance metadata log file management method, system, medium and terminal |
CN112860649A (en) * | 2021-02-03 | 2021-05-28 | 深圳市木浪云数据有限公司 | Method, device and system for generating index in increment manner |
CN112948327A (en) * | 2021-04-01 | 2021-06-11 | 北京奇艺世纪科技有限公司 | File processing method, system, electronic device and storage medium |
CN113656645A (en) * | 2020-05-12 | 2021-11-16 | 北京字节跳动网络技术有限公司 | Log consumption method and device |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101571827A (en) * | 2008-04-30 | 2009-11-04 | 国际商业机器公司 | Method for saving logs and log system |
CN102483755A (en) * | 2009-06-26 | 2012-05-30 | 森普利维蒂公司 | File system |
CN102622407A (en) * | 2012-01-29 | 2012-08-01 | 广州亦云信息技术有限公司 | Log file operating system and log file management method |
CN102737130A (en) * | 2012-06-21 | 2012-10-17 | 广州从兴电子开发有限公司 | Method and system for processing metadata of hadoop distributed file system (HDFS) |
CN104123300A (en) * | 2013-04-26 | 2014-10-29 | 上海云人信息科技有限公司 | Data distributed storage system and method |
CN104239443A (en) * | 2014-09-01 | 2014-12-24 | 英方软件(上海)有限公司 | Serialization data operation log storage method |
CN106663056A (en) * | 2014-08-28 | 2017-05-10 | 华为技术有限公司 | Metadata index search in file system |
US9767139B1 (en) * | 2014-06-30 | 2017-09-19 | EMC IP Holding Company LLC | End-to-end data integrity in parallel storage systems |
CN108052679A (en) * | 2018-01-04 | 2018-05-18 | 焦点科技股份有限公司 | A kind of Log Analysis System based on HADOOP |
CN108153804A (en) * | 2017-11-17 | 2018-06-12 | 极道科技(北京)有限公司 | A kind of metadata daily record update method of symmetric distributed file system |
US20180181645A1 (en) * | 2015-06-09 | 2018-06-28 | Palantir Technologies Inc. | Systems and methods for indexing and aggregating data records |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111427989A (en) * | 2019-01-10 | 2020-07-17 | 北大方正集团有限公司 | Index processing method, index processing system and storage medium for full-text retrieval |
CN110032496A (en) * | 2019-04-19 | 2019-07-19 | 杭州玳数科技有限公司 | A kind of log collection method and system for supporting diversified log merging |
CN110032496B (en) * | 2019-04-19 | 2023-10-13 | 杭州玳数科技有限公司 | Log acquisition method and system supporting diversified log merging |
CN111208946A (en) * | 2020-01-06 | 2020-05-29 | 北京同有飞骥科技股份有限公司 | Data persistence method and system supporting KB-level small file concurrent IO |
CN113656645A (en) * | 2020-05-12 | 2021-11-16 | 北京字节跳动网络技术有限公司 | Log consumption method and device |
CN111984598A (en) * | 2020-08-20 | 2020-11-24 | 重庆紫光华山智安科技有限公司 | High-performance metadata log file management method, system, medium and terminal |
CN112860649A (en) * | 2021-02-03 | 2021-05-28 | 深圳市木浪云数据有限公司 | Method, device and system for generating index in increment manner |
CN112948327A (en) * | 2021-04-01 | 2021-06-11 | 北京奇艺世纪科技有限公司 | File processing method, system, electronic device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108984686B (en) | 2021-03-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||