CN103793475A - Distributed file system data migration method

Info

Publication number: CN103793475A
Application number: CN201410005142.7A
Authority: CN
Inventors: 郭照斌; 季旻; 姜国梁; 马振杰; 杨鹏
Original assignee: WUXI CITY CLOUD COMPUTER CENTER CO Ltd
Current assignee: WUXI CITY CLOUD COMPUTER CENTER CO Ltd
Priority date: 2014-01-06
Filing date: 2014-01-06
Publication date: 2014-05-14
Anticipated expiration: 2034-01-06
Also published as: CN103793475B

Abstract

The invention discloses a data migration method for a distributed file system. During data migration, a frequently modified or written file is selected as the migration source file. For a file being migrated, modified or newly written data is written directly to the destination node of the migration, an index of the new data is built over the original data, and the unmodified data is then migrated. Compared with conventional cold data migration, the method can greatly reduce load-balancing time, save a large amount of network I/O and disk I/O, and quickly balance the data load across all nodes.

Description

A data migration method for a distributed file system

Technical field

The present invention relates to the field of computing, and in particular to a data migration method for a distributed file system.

Background art

A distributed file system generally comprises clients, a metadata server, and data servers. The client provides the access interface to file data, the metadata server manages the layout and basic attributes of files, and the data servers store the file contents.

The balance of load and capacity among the data server nodes usually has a great impact on the performance and stability of the whole system. Online expansion, that is, adding new nodes, is in turn an indispensable feature of a distributed file system, but adding new nodes inevitably leaves capacity and load unbalanced between the old and new nodes. Data migration is the common way to solve this problem.

Conventional data migration selects files that are not frequently accessed as source files, so that normal writes and the migration do not interfere with each other. This approach balances slowly, however, and a modify or write operation on a file being migrated causes the migration to fail, so the data already migrated has consumed network bandwidth and disk I/O to no effect.

Summary of the invention

In view of the deficiencies of the prior art, the object of the present invention is to provide a data migration method for a distributed file system. The invention proposes selecting frequently accessed files as source files, which achieves fast balancing without wasting network bandwidth or disk I/O.

The object of the invention is achieved by the following technical solution:

The invention provides a data migration method for a distributed file system, the improvement being that the method comprises: during data migration in the distributed file system, a frequently modified or written distributed file is selected as the migration source file; for a distributed file being migrated, modified or newly written data is written directly to the destination node of the migration, an index of the new data is built over the original data, and the unmodified data is then migrated;

The method comprises the following steps:

(1) Count the modify and write accesses to each distributed file, and select a distributed file with a high access frequency as the migration source file;

(2) When data is to be written to the source file, the client obtains the layout information from the metadata server and sends the data to the specified source node (a source node is a node on which the source file is laid out; one source file may correspond to multiple source nodes);

(3) The source node creates an index node on the destination node of the migration, then forwards the data to the index node;

(4) After the index node finishes writing the data, it returns to the source node, and the source node updates the index record;

(5) The source node returns to the client, which completes the write and also amounts to migrating this block of data;

(6) A background controller migrates the content that has not been written: it copies the data from the source node, writes it to the destination node, and records the write in the index record;

(7) Once all content on the source node has been moved to the index node, the metadata server is notified to modify the file layout information and the native object is deleted (the native object is the file data content on the source node, so "native object" may be read throughout as "the file data content on the source node"). The migration of the distributed file's data is then complete.
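Steps (1) to (7) can be sketched as a small in-memory simulation. This is a minimal illustration under stated assumptions, not the patented implementation: the class `SourceNode`, the function `pick_source_file`, the file names, and the node names are all hypothetical, and the network hops between client, source node, and destination node are collapsed into ordinary method calls.

```python
from collections import Counter


def pick_source_file(write_log):
    """Step (1): choose the file with the most modify/write accesses."""
    return Counter(write_log).most_common(1)[0][0]


class SourceNode:
    """Hypothetical source node: original blocks plus the migration index."""

    def __init__(self, blocks):
        self.local = dict(enumerate(blocks))  # block number -> original bytes
        self.index = [0] * len(blocks)        # 1 = block is on the destination
        self.dest = None                      # stands in for the index node

    def write(self, block_no, data):
        """Steps (2)-(5): a client write is forwarded instead of stored locally."""
        if self.dest is None:
            self.dest = {}                    # step (3): create the index node
        self.dest[block_no] = data            # step (3): forward the data
        self.index[block_no] = 1              # step (4): update the index record
        return "ok"                           # step (5): return to the client

    def migrate_rest(self):
        """Step (6): background copy of every block that was never written."""
        if self.dest is None:
            self.dest = {}
        copied = 0
        for block_no, bit in enumerate(self.index):
            if not bit:
                self.dest[block_no] = self.local[block_no]
                self.index[block_no] = 1
                copied += 1
        return copied

    def finish(self, layout, name, dest_name):
        """Step (7): update the layout, then delete the native object."""
        if not all(self.index):
            raise RuntimeError("migration incomplete")
        layout[name] = dest_name
        self.local.clear()


# Usage with made-up data: "b.db" is written most often, so it is migrated.
write_log = ["a.log", "b.db", "b.db", "b.db", "a.log"]
hot = pick_source_file(write_log)
node = SourceNode([b"old0", b"old1", b"old2", b"old3"])
node.write(1, b"new1")
node.write(3, b"new3")
copied = node.migrate_rest()   # only blocks 0 and 2 still need copying
layout = {hot: "node-A"}
node.finish(layout, hot, "node-B")
```

In this run only two of the four blocks are copied by the background pass, because the two client writes already landed on the destination.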

Further, the modified or newly written data is written directly to the destination node of the migration in one of the following modes:

Mode 1: when the data arrives at the source node, the source node forwards it directly to the destination node of the migration;

Mode 2: on a write, the client writes the data directly to the destination node of the migration and then notifies the object on the source node.
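The two modes can be contrasted with a short sketch. The function names and the returned byte count are illustrative assumptions: the count only reflects that in Mode 1 the payload travels client to source node to destination node, while in Mode 2 it travels client to destination node and the source node receives a notification without the payload.

```python
def write_mode1(dest, index, block_no, data):
    """Mode 1: the source node receives the data and forwards it."""
    payload_hops = 2              # client -> source node -> destination node
    dest[block_no] = data
    index.add(block_no)           # source node records the block itself
    return payload_hops * len(data)


def write_mode2(dest, index, block_no, data):
    """Mode 2: the client writes to the destination, then notifies the source."""
    payload_hops = 1              # client -> destination node
    dest[block_no] = data
    index.add(block_no)           # recorded when the notification arrives
    return payload_hops * len(data)


# Both modes leave the destination and the index in the same state.
dest1, index1 = {}, set()
dest2, index2 = {}, set()
sent1 = write_mode1(dest1, index1, 5, b"x" * 4096)
sent2 = write_mode2(dest2, index2, 5, b"x" * 4096)
```

Under these assumptions Mode 2 moves the payload over the network once rather than twice, at the cost of the client needing to know the destination node.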

Further, building the index of the new data over the original data comprises: establishing the index relation between the source node and the destination node by means of a bitmap, an array, or a tree structure;

The relation between the source node and the destination node is recorded with one bit per 4 KB, the client's minimum operation unit; on each write, the entry at the corresponding offset in the bitmap, array, or tree structure is set to 1.
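A bitmap of this kind might look like the following sketch. The class name `MigrationBitmap` and its methods are hypothetical, and the sketch assumes that writes are aligned to the 4 KB unit, which follows from the text calling 4 KB the client's minimum operation unit.

```python
BLOCK_SIZE = 4096  # client's minimum operation unit, per the text


class MigrationBitmap:
    """One bit per 4 KB block: 1 means the block is on the destination node."""

    def __init__(self, file_size):
        self.nblocks = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.bits = bytearray((self.nblocks + 7) // 8)

    def mark_written(self, offset, length):
        """Set the bit for every block covered by a block-aligned write."""
        if offset % BLOCK_SIZE or length % BLOCK_SIZE or length <= 0:
            raise ValueError("writes must be aligned to the 4 KB unit")
        if offset + length > self.nblocks * BLOCK_SIZE:
            raise ValueError("write extends past the end of the file")
        first = offset // BLOCK_SIZE
        for b in range(first, first + length // BLOCK_SIZE):
            self.bits[b // 8] |= 1 << (b % 8)

    def is_migrated(self, block_no):
        return bool(self.bits[block_no // 8] & (1 << (block_no % 8)))

    def pending(self):
        """Blocks the background controller still has to copy."""
        return [b for b in range(self.nblocks) if not self.is_migrated(b)]


# Usage: a 10-block file in which blocks 1 and 2 have been rewritten.
bm = MigrationBitmap(10 * BLOCK_SIZE)
bm.mark_written(offset=BLOCK_SIZE, length=2 * BLOCK_SIZE)
todo = bm.pending()
```

The same interface could sit on top of an array or a tree; a tree would presumably suit very large, sparsely written files better, though the source does not say which structure is preferred.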

Further, in step (3), the source node checks whether the content to be read is on the index node; if so, the content is read from the index node; if not, the local content is read directly and returned.

Compared with the prior art, the beneficial effects of the present invention are:

The data migration method for a distributed file system provided by the invention selects frequently accessed files as source files, which achieves fast balancing without wasting network bandwidth or disk I/O. During data migration, a frequently modified or written file is selected as the migration source file; for a file being migrated, modified or newly written data is written directly to the destination node of the migration, an index of the new data is built over the original data, and the unmodified data is then migrated.

Brief description of the drawings

Fig. 1 is a flowchart of a write during data migration as provided by the invention.

Detailed description of the embodiments

The specific embodiments of the invention are described in further detail below with reference to the accompanying drawing.

The invention provides a data migration method for a distributed file system, the method comprising: during data migration in the distributed file system, a frequently modified or written distributed file is selected as the migration source file; for a distributed file being migrated, modified or newly written data is written directly to the destination node of the migration, an index of the new data is built over the original data, and the unmodified data is then migrated;

The flow of a write during data migration is shown in Fig. 1 and comprises the following steps:

(2) When data is to be written to the source file, the client obtains the layout information from the metadata server and sends the data to the specified source node (a source node is a node on which the source file is laid out; one source file may correspond to multiple source nodes);

(7) Once all content on the source node has been moved to the index node, the metadata server is notified to modify the file layout information and the native object is deleted (the native object is the file data content on the source node). The migration of the distributed file's data is then complete.

The modified or newly written data is written directly to the destination node of the migration in one of the following modes:

Building the index of the new data over the original data comprises: establishing the index relation between the source node and the destination node by means of a bitmap, an array, or a tree structure.

A. Recording method of the data index:

The index can be recorded in the form of a bitmap: the relation between the source object and the destination object is recorded with one bit per 4 KB, the client's minimum operation unit, and on each write the bit at the corresponding offset in the bitmap is set to 1.
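As a rough size check, assuming a file of S bytes and exactly one bit per 4 KB block as described above, the bitmap needs about the following number of bits. The 1 GiB file size is an illustrative figure, not one given in the source.

```latex
N_{\text{bits}} = \left\lceil \frac{S}{4096} \right\rceil,
\qquad
S = 1\ \text{GiB} = 2^{30}\ \text{bytes}
\;\Rightarrow\;
N_{\text{bits}} = \frac{2^{30}}{2^{12}} = 2^{18} = 262144\ \text{bits} = 32\ \text{KiB}.
```

On that assumption the index for a 1 GiB file is small enough to keep in memory on the source node.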

B. The following describes how data is accessed normally by the client during migration:

<1> When the client needs to read a file, it obtains the layout from the metadata server and sends the request to the specified source node;

<2> The source node checks whether the content to be read is in the index object; if so, it reads the content from the index object; if not, it reads the local content directly and returns it.
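The read path in <1> and <2> can be sketched as follows. The function `read_blocks` and its parameters are hypothetical, the index is modelled as a plain list of 0/1 flags, and the blocks are shortened to four bytes so the result is easy to inspect.

```python
def read_blocks(first, count, index_bits, dest_blocks, local_blocks):
    """Serve a read on the source node while a migration is in progress.

    For each block, the index decides where the current data lives:
    flag 1 -> read from the index object on the destination node,
    flag 0 -> read the local (not yet migrated) content.
    """
    out = []
    for block_no in range(first, first + count):
        if index_bits[block_no]:
            out.append(dest_blocks[block_no])
        else:
            out.append(local_blocks[block_no])
    return b"".join(out)


# Usage: block 1 was rewritten during migration, blocks 0 and 2 were not.
local_blocks = [b"AAAA", b"BBBB", b"CCCC"]
dest_blocks = {1: b"bbbb"}
index_bits = [0, 1, 0]
data = read_blocks(0, 3, index_bits, dest_blocks, local_blocks)
```

The client therefore sees the newest data for block 1 even though the layout held by the metadata server still points at the source node.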

Compared with conventional cold data migration, the invention can greatly reduce load-balancing time, save a large amount of network I/O and disk I/O, and quickly balance the data load across the nodes.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of the claims of the invention.

Claims

1. A data migration method for a distributed file system, characterized in that the method comprises: during data migration in the distributed file system, a frequently modified or written distributed file is selected as the migration source file; for a distributed file being migrated, modified or newly written data is written directly to the destination node of the migration, an index of the new data is built over the original data, and the unmodified data is then migrated;

The method comprises the following steps:

(2) When data is to be written to the source file, the client obtains the layout information from the metadata server and sends the data to the specified source node;

(7) Once all content on the source node has been moved to the index node, the metadata server is notified to modify the file layout information and the native object is deleted; the migration of the distributed file's data is then complete.

2. The method of claim 1, characterized in that the modified or newly written data is written directly to the destination node of the migration in one of the following modes:

3. The method of claim 1, characterized in that building the index of the new data over the original data comprises: establishing the index relation between the source node and the destination node by means of a bitmap, an array, or a tree structure;

4. The method of claim 1, characterized in that, in step (3), the source node checks whether the content to be read is on the index node; if so, the content is read from the index node; if not, the local content is read directly and returned.