CN102929789B

CN102929789B - Record organization method and record organization structure

Info

Publication number: CN102929789B
Application number: CN201210357225.3A
Authority: CN
Inventors: 马照云; 杨浩; 马振杰; 苗艳超; 刘新春; 邵宗有
Original assignee: Dawning Information Industry Beijing Co Ltd
Current assignee: Dawning Information Industry Beijing Co Ltd; Dawning Information Industry Co Ltd
Priority date: 2012-09-21
Filing date: 2012-09-21
Publication date: 2016-06-08
Anticipated expiration: 2032-09-21
Also published as: CN102929789A

Abstract

The present invention discloses a record organization method and a record organization structure. The method comprises: assigning a unique identifier to each type of record; creating, in corresponding disk files, a data file and a metadata file for each type of record, wherein the data file of each type of record corresponds to one disk file; and restricting the data file to sequential append writes only. The technical solution of the present invention improves the performance of the metadata server under heavy load and provides good extensibility.

Description

Record organization method and record organization structure

Technical field

The present invention relates to storage systems of data servers, and more specifically to a record organization method and a record organization structure.

Background art

In a parallel storage system, auxiliary records or indexes often need to be kept. For example, to recover the data on a disk quickly after the disk fails, the system must record which objects reside on that disk; this requires adding a record when a file is created and deleting the corresponding record when the file is deleted. In addition, because the data servers keep multiple replicas, inconsistent objects must also be recorded, and the corresponding record is deleted once the repair completes. To improve metadata performance, deletion is asynchronous: the metadata server responds to the client as soon as it has set a deletion mark, and the actual removal of the inode (index node) and the data is handled by a garbage collection thread. The quota function likewise needs to record which files have undergone operations such as an owner change, and a background thread sends messages to the data servers to update the relevant quota information. To prevent records from being lost on power failure, which would leave some operations incomplete, these records must be written to underlying disk files. To improve reliability further, they are also logged through the metadata log mechanism, so that they are replicated between master and standby and have the same reliability as the metadata.

To make full use of existing metadata mechanisms, the existing implementation maps each type of record to a directory inode and manages each record as a dentry of that inode. Adding a record then requires little extra work, and when a record becomes invalid it can be deleted directly through the dentry deletion interface.

However, this implementation has the following problems:

1) It is hard to extend: the metadata of a record is stored in a file system inode, whose reserved fields are limited, so attributes can no longer be added once the attributes required by the record metadata exceed that limit (modifying the inode, the most important file system metadata structure, for the sake of an auxiliary feature is undesirable);

2) It wastes space: because each record is stored as a dentry, it inherits all the attributes of a dentry, yet not all of these attributes are needed by the record;

3) Most seriously, this implementation degrades log application performance and can become the system bottleneck when metadata load is high, affecting overall performance. The hashing used for metadata aims to distribute dentries as evenly as possible across the hash buckets, and the number of these auxiliary records is very large (because each file of a ParaStor file system can use multiple disks). When the extended hash grows large, log application forces the disk head to seek back and forth, which greatly reduces application efficiency. Once the log backlog reaches a certain level, metadata log submission requests are temporarily refused, which in turn affects the performance of the whole metadata service.

No effective solution to these problems in the related art has yet been proposed.

Summary of the invention

In view of the problems in the related art, the present invention proposes a record organization method and a record organization structure that improve the performance of the metadata server and extend well.

According to one aspect of the present invention, a record organization method is provided, comprising: assigning a unique identifier to each type of record; creating, in corresponding disk files, a data file and a metadata file for each type of record, wherein the data file of each type of record corresponds to one disk file; and restricting the data file to sequential append writes only.

Preferably, the metadata information in the metadata file is indexed by the unique identifier.

Preferably, when a record in the data file becomes invalid, an offset record corresponding to the invalid record is appended at the end of the data file.

Preferably, when the records need to be used, the data file is reclaimed by cancelling each invalid record against its offset record.

More preferably, the reclamation is performed by a reclamation function registered in advance.

More preferably, the reclamation is performed by sequentially scanning the data file.

More preferably, the reclamation comprises: adding all offset records to a hash linked list; sequentially scanning the data file and, for each valid record, searching the hash linked list for its offset record; if a corresponding offset record is found, removing that offset record from the hash linked list; and if no corresponding offset record is found, writing the valid record to a new temporary file.

Preferably, after all valid records have been scanned, the data file is replaced with the new temporary file.

Preferably, only a logical log is recorded when a log is generated.

According to another aspect of the present invention, a record organization structure is provided. The record organization structure comprises a data file and a metadata file corresponding to a type of record, wherein each type of record is assigned a unique identifier, the data file of each type of record corresponds to one disk file, and the data file is subject only to sequential append writes.

With this simple metadata organization, the present invention eliminates the need to index the data file and implements record deletion by appending offset records, so that access to the objfile file is reduced to sequential append writes and sequential reads, which improves disk read/write performance. It also uses logical logs to solve the problem that some information cannot be determined when logs are generated by concurrent threads.

Brief description of the drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the accompanying drawings required by the embodiments are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of the record organization method according to the present invention;

Fig. 2 is a schematic diagram of the data organization structure according to an embodiment of the present invention; and

Fig. 3 is a schematic diagram of the record organization and the result of cancellation according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention fall within the scope of protection of the present invention.

To address the problems in the prior art, the inventors propose a simple record organization scheme, described in detail below with reference to Fig. 1 to Fig. 3.

Fig. 1 is a flowchart of the record organization method according to the present invention.

With reference to Fig. 1, the record organization method according to the present invention comprises: S102, assigning a unique identifier to each type of record; S104, creating, in corresponding disk files, a data file and a metadata file for each type of record, wherein the data file of each type of record corresponds to one disk file; and S106, restricting the data file to sequential append writes only.

Fig. 2 is a schematic diagram of the data organization structure according to an embodiment of the present invention.

With reference to Fig. 2, the data organization structure according to the present invention is as follows. Each type of record is assigned a unique identifier (called the fid, starting from 1). Each type of record corresponds to one disk file that stores the actual records; this file is named after the fid and is called the objfile file (data file). A structure stores the related metadata information (such as the current number of records); this metadata information is kept in a file named 0 (called the objfile metadata file), in which it is indexed by fid.
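
The layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not specify the on-disk format of the metadata structure, so the fixed-size slot holding a single record count (`META_ENTRY`), the helper names, and the use of `fid * slot size` as the index are all hypothetical.

```python
import os
import struct
import tempfile

META_FILE_NAME = "0"              # the description names the metadata file 0
META_ENTRY = struct.Struct("<Q")  # assumed slot layout: current record count only


def data_file_path(root: str, fid: int) -> str:
    """Return the objfile path for a record type; the file is named after its fid."""
    if fid < 1:
        raise ValueError("fid values start at 1; the name 0 is the metadata file")
    return os.path.join(root, str(fid))


def write_record_count(root: str, fid: int, count: int) -> None:
    """Store the metadata of record type `fid` in the slot indexed by fid."""
    path = os.path.join(root, META_FILE_NAME)
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as meta:
        meta.seek(fid * META_ENTRY.size)
        meta.write(META_ENTRY.pack(count))


def read_record_count(root: str, fid: int) -> int:
    """Read the record count of `fid`; unwritten slots read as 0."""
    path = os.path.join(root, META_FILE_NAME)
    if not os.path.exists(path):
        return 0
    with open(path, "rb") as meta:
        meta.seek(fid * META_ENTRY.size)
        raw = meta.read(META_ENTRY.size)
    return META_ENTRY.unpack(raw)[0] if len(raw) == META_ENTRY.size else 0


# Usage: two record types share one directory and one metadata file.
root = tempfile.mkdtemp()
write_record_count(root, 2, 7)   # e.g. "inconsistent object" records
write_record_count(root, 1, 3)   # e.g. "asynchronous inode deletion" records
```

Because the slot position is derived directly from the fid, looking up the metadata of a record type needs no separate index structure.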

Fig. 2 shows only the data files for two types of record, "asynchronous inode deletion: 1" and "inconsistent object: 2", but those skilled in the art will understand that other types of record and their corresponding data files may also exist.

To simplify processing, when a record A in the data file becomes invalid, it is not deleted from the objfile file immediately; instead, an offset record A' is appended at the end of the file.
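
A minimal sketch of this append-only behavior is shown below. The record layout (a one-byte kind followed by an eight-byte key) and the function names are assumptions made for illustration; the patent does not define the record format. An in-memory `io.BytesIO` stands in for the objfile file.

```python
import io
import struct

RECORD = struct.Struct("<BQ")    # hypothetical layout: 1-byte kind, 8-byte key
KIND_NORMAL, KIND_OFFSET = 0, 1  # hypothetical kind codes


def append_record(objfile, kind: int, key: int) -> int:
    """Append one record at the end of the file and return its byte offset."""
    objfile.seek(0, io.SEEK_END)
    offset = objfile.tell()
    objfile.write(RECORD.pack(kind, key))
    return offset


def add(objfile, key: int) -> int:
    """Add an ordinary record."""
    return append_record(objfile, KIND_NORMAL, key)


def invalidate(objfile, key: int) -> int:
    """Mark a record invalid by appending its offset record; nothing is rewritten."""
    return append_record(objfile, KIND_OFFSET, key)


objfile = io.BytesIO()
add(objfile, 100)                          # record A
add(objfile, 200)                          # record B
offset_of_a_prime = invalidate(objfile, 100)   # offset record A'
```

Record A stays in place after it is invalidated; the only write is the append of A' at the end, so the file is never modified in the middle.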

One difficulty here is that the log system writes disk files according to the file name and offset in each log entry. The threads that produce objfile records may run concurrently, and each producing thread may produce records for several objfile files at once, so the offset of a record within its file is not yet determined when the log is generated. (The objfile lock cannot be held for the whole lifetime of a log transaction, because this is a critical path and doing so would reduce metadata performance; moreover, the log generated first is not necessarily submitted first.) To solve this problem, the inventors propose recording only a logical log when the log is generated and using a single logical log expansion thread (when the log system encounters a logical log, it calls the logical log callback function to expand it).
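
One way to picture the logical log is sketched below. The class and field names (`LogicalLogEntry`, `PhysicalWrite`, `LogExpander`) are hypothetical, and a `queue.Queue` stands in for the log system; the patent does not describe its log format. The point illustrated is that producers never compute file offsets, and that offsets are assigned at a single expansion point in the order the entries are applied.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class LogicalLogEntry:
    """What a producing thread logs: which objfile, and what to append (no offset)."""
    fid: int
    payload: bytes


@dataclass
class PhysicalWrite:
    """What expansion yields: a concrete file name and offset for the log system."""
    filename: str
    offset: int
    payload: bytes


class LogExpander:
    """Single expansion point; it alone tracks the end offset of each objfile."""

    def __init__(self) -> None:
        self._end = {}  # fid -> current end-of-file offset

    def expand(self, entry: LogicalLogEntry) -> PhysicalWrite:
        offset = self._end.get(entry.fid, 0)
        self._end[entry.fid] = offset + len(entry.payload)
        return PhysicalWrite(str(entry.fid), offset, entry.payload)


def producer(log_queue: queue.Queue, fid: int, count: int) -> None:
    """Concurrent producer: emits logical entries without taking any objfile lock."""
    for i in range(count):
        log_queue.put(LogicalLogEntry(fid, b"rec%04d" % i))


log_queue = queue.Queue()
threads = [threading.Thread(target=producer, args=(log_queue, 1, 50)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expansion runs in one place only, so offsets are contiguous however producers interleave.
expander = LogExpander()
writes = []
while not log_queue.empty():
    writes.append(expander.expand(log_queue.get()))
```

Since only the expander decides offsets, the order in which producers generated or submitted their entries does not matter for correctness of the resulting file positions.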

When the records need to be used, the objfile file is first reclaimed: A and A' cancel each other out, leaving only the valid records.

To improve extensibility, objfile supports the registration of reclamation functions. Each type of record can register a specific reclamation function in advance according to its needs, and the registered function is used during reclamation.
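
The registration mechanism might look like the following sketch. All names here (`register_reclaimer`, `reclaim`, `dedup_reclaim`) and the `(kind, key)` record representation are hypothetical; the patent only states that a function can be registered per record type and is used when reclaiming. The custom function shown, which also drops duplicate keys, is an invented example of a type-specific need.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, int]          # hypothetical (kind, key) pair
KIND_NORMAL, KIND_OFFSET = 0, 1   # hypothetical kind codes
ReclaimFn = Callable[[List[Record]], List[Record]]

_reclaimers: Dict[int, ReclaimFn] = {}   # fid -> registered reclamation function


def register_reclaimer(fid: int, fn: ReclaimFn) -> None:
    """Register a type-specific reclamation function in advance."""
    _reclaimers[fid] = fn


def default_reclaim(records: List[Record]) -> List[Record]:
    """Fallback: cancel each offset record against one ordinary record with the same key."""
    pending = Counter(key for kind, key in records if kind == KIND_OFFSET)
    survivors = []
    for kind, key in records:
        if kind == KIND_OFFSET:
            continue
        if pending[key] > 0:
            pending[key] -= 1          # cancelled by an offset record
        else:
            survivors.append((kind, key))
    return survivors


def reclaim(fid: int, records: List[Record]) -> List[Record]:
    """Use the registered function for this fid if there is one, else the default."""
    return _reclaimers.get(fid, default_reclaim)(records)


def dedup_reclaim(records: List[Record]) -> List[Record]:
    """Invented custom policy: default cancellation, then keep one record per key."""
    seen, out = set(), []
    for kind, key in default_reclaim(records):
        if key not in seen:
            seen.add(key)
            out.append((kind, key))
    return out


register_reclaimer(2, dedup_reclaim)
sample = [(0, 5), (0, 5), (0, 6), (1, 6)]
```

With this sample, record type 1 falls back to the default behavior while record type 2 uses its registered function, so the same framework serves both.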

Alternatively, the default behavior can be used for reclamation. The default reclamation sequentially scans the objfile file and adds the offset records to a hash linked list. After this scan completes, the file is scanned sequentially again; this second scan considers only the valid records (records that are not offset records). For each such record, the hash linked list is first searched for a matching offset record. If one is found, the offset record is removed from the hash linked list; otherwise the record is written to a new temporary file. After the scan completes, the temporary file replaces the objfile file. The objfile record organization and the result of cancellation are shown in Fig. 3.

Adopting the above organization provides the following advantages:
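
The default reclamation can be sketched as two sequential passes over the file. Several things here are assumptions rather than the patent's implementation: the record layout (one-byte kind, eight-byte key), the function names, and the use of a Python dictionary in place of the hash linked list (it plays the same role of a keyed lookup table for pending offset records).

```python
import os
import struct
import tempfile
from collections import defaultdict

RECORD = struct.Struct("<BQ")    # hypothetical layout: 1-byte kind, 8-byte key
KIND_NORMAL, KIND_OFFSET = 0, 1  # hypothetical kind codes


def read_records(path):
    """Sequentially yield (kind, key) pairs from an objfile."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            yield RECORD.unpack(chunk)


def reclaim_objfile(path: str) -> None:
    # Pass 1: collect every offset record into the hash table.
    pending = defaultdict(int)
    for kind, key in read_records(path):
        if kind == KIND_OFFSET:
            pending[key] += 1

    # Pass 2: look at ordinary records only; keep those with no pending offset record.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as tmp:
        for kind, key in read_records(path):
            if kind == KIND_OFFSET:
                continue
            if pending.get(key, 0) > 0:
                pending[key] -= 1      # take the offset record out; both cancel
            else:
                tmp.write(RECORD.pack(kind, key))

    # Finally the temporary file takes the place of the objfile.
    os.replace(tmp_path, path)


# Usage: A, B, A', C  ->  B, C
workdir = tempfile.mkdtemp()
objfile_path = os.path.join(workdir, "1")
with open(objfile_path, "wb") as f:
    for kind, key in [(0, 100), (0, 200), (1, 100), (0, 300)]:
        f.write(RECORD.pack(kind, key))
reclaim_objfile(objfile_path)
remaining = list(read_records(objfile_path))
```

Both passes read the file front to back and the surviving records are written out by appending, so reclamation itself uses only sequential reads and writes.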

1) Because the objfile file is written only by sequential appends and the record organization is simple, adding a new record does not cause a hash bucket to expand and therefore does not force already written records to be relocated. This greatly improves log application efficiency and, in turn, the performance of the metadata server under heavy load.

2) No useless attributes are written to disk, which saves space compared with the original approach; the objfile metadata structure and record structure are designed specifically for this purpose, so they carry no redundant information.

3) It has good extensibility: members can easily be added to or removed from the objfile structures, and when a new auxiliary record type arises, the framework can be extended to support it very easily, with specific requirements met through registered functions.

In summary, the above technical solution of the present invention proposes a simple objfile metadata organization in which each type of record is assigned a specific fid and the fid serves as the index of the associated metadata in the objfile metadata file. This eliminates the need to index objfile records, and record deletion is implemented by appending offset records, so access to the objfile file is reduced to sequential append writes and sequential reads, which improves disk read/write performance. In addition, logical logs solve the problem that some information cannot be determined when logs are generated by concurrent threads.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A record organization method, characterized in that the method comprises:

assigning a unique identifier to each type of record;

creating, in corresponding disk files, a data file and a metadata file for each type of record, the data file comprising a plurality of records, wherein the data file of each type of record corresponds to one disk file; and

when a record in the data file becomes invalid, appending an offset record corresponding to the invalid record at the end of the data file, so that the data file is subject only to sequential append writes.

2. The method according to claim 1, characterized in that the metadata information in the metadata file is indexed by the unique identifier.

3. The method according to claim 1, characterized in that, when the records need to be used, the data file is reclaimed by cancelling the invalid record against the offset record.

4. The method according to claim 3, characterized in that the reclamation is performed by a reclamation function registered in advance.

5. The method according to claim 3, characterized in that the reclamation is performed by sequentially scanning the data file.

6. The method according to claim 5, characterized in that the reclamation comprises:

adding all offset records to a hash linked list;

sequentially scanning the data file and, for each valid record, searching the hash linked list for its offset record;

if a corresponding offset record is found, removing that offset record from the hash linked list; and

if no corresponding offset record is found, writing the valid record to a new temporary file.

7. The method according to claim 6, characterized in that, after all valid records have been scanned, the data file is replaced with the new temporary file.

8. The method according to any one of claims 1 to 7, characterized in that only a logical log is recorded when a log is generated.

9. A record organization device, characterized in that the record organization device comprises: means for creating a data file and a metadata file corresponding to a type of record, the data file comprising a plurality of records; means for assigning a unique identifier to each type of record and making the data file of each type of record correspond to one disk file; and means for appending, when a record in the data file becomes invalid, an offset record corresponding to the invalid record at the end of the data file, so that the data file is subject only to sequential append writes.