CN103077219A - Method and device for automatically storing data

Info

Publication number: CN103077219A
Application number: CN2012105894919A
Authority: CN
Inventors: 张森林; 冯圣中
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2012-12-29
Filing date: 2012-12-29
Publication date: 2013-05-01

Abstract

The invention is applied to the field of Internet communication and provides a device for automatically storing data. The device comprises a data identifying module, a node identifying module, a matching module and an adjusting module. The data identifying module identifies specific data; the node identifying module identifies a high-performance node in a cluster; the matching module stores the specific data on the high-performance node; and the adjusting module periodically checks the position of the specific data and starts the matching module when the specific data is not on the high-performance node. Because the specific data is always stored on the high-performance node, the system uses a high-quality storage medium when it processes an access request for the specific data, which gives the specific data a lower access delay.

Description

Method and device for automatically placing data

Technical field

The invention belongs to the field of Internet communication and relates in particular to a method and device for automatically placing data.

Background technology

With the explosive growth of data, clusters that store massive amounts of data have emerged. Because the volume of data stored in a cluster is very large, in some cases reaching the petabyte (PB) level, a large number of data accesses may occur at the same moment. Whether data is stored in a reasonable location therefore directly affects its access delay.

Access delay is the time from when a user issues a data access request to when the user receives the data. Because the data volume in a cluster is huge, many servers may be used to store it, and a user's request may arrive at a server node that is heavily loaded; the request then has to queue, which causes a larger access delay. Several methods address access delay. Load balancing spreads the load in the cluster evenly across the nodes so that no single node becomes a service bottleneck. Multi-copy methods make several additional copies of data that is accessed frequently, so that a lightly loaded, nearby server can be selected from among the copies to transmit the data. Hierarchical storage places the small, active data set on nodes with good access performance so that overall access is optimal.

However, some data is accessed infrequently yet is very important: it is rarely accessed, but when it is accessed, the real-time requirement is high. Hierarchical storage handles such data poorly, because it classifies data mainly by activity level, migration cost and similar factors. Simply creating multiple copies or maintaining load balance cannot guarantee a lower access delay for specific data either.

Because the server nodes in a cluster often differ in access performance and data differs in its access characteristics, a more reasonable data placement strategy is needed, one that always keeps the specific data on high-performance nodes. The system then uses a high-quality storage medium when it processes an access request for the specific data, which ensures that the specific data has a lower access delay.

Summary of the invention

The embodiments of the invention provide a method and device for automatically placing data. They are intended to solve the problem that, in current clusters where nodes differ in access performance and data differs in access characteristics, specific data cannot always be kept on high-performance nodes, so a lower access delay for the specific data cannot be guaranteed.

To this end, the embodiments of the invention provide the following technical solution:

A method for automatically placing data comprises the following steps:

S101: identifying specific data;

S102: identifying a high-performance node in a cluster;

S103: placing the specific data on the high-performance node;

S104: periodically checking the position of the specific data and, when the specific data is not on the high-performance node, repeating step S103.

The embodiments of the invention also provide a device for automatically placing data, comprising:

a data identifying module, used for identifying specific data;

a node identifying module, used for identifying a high-performance node in a cluster;

a matching module, used for placing the specific data on the high-performance node;

an adjusting module, used for periodically checking the position of the specific data and starting the matching module when the specific data is not on the high-performance node.

Compared with the prior art, the embodiments of the invention have the following advantage:

The embodiments of the invention identify the specific data, identify the high-performance node in the cluster, place the specific data on the high-performance node, and periodically check the position of the specific data. The specific data is thereby kept on a node with good access performance, so that it has a lower access delay.

Description of drawings

Fig. 1 is a flow chart of the method for automatically placing data provided by the first embodiment of the invention;

Fig. 2 is a structural diagram of the device for automatically placing data provided by the second embodiment of the invention.

Embodiment

To make the purpose, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only some of the embodiments of the invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the invention without creative work fall within the protection scope of the invention.

Fig. 1 is a flow chart of the method for automatically placing data provided by the first embodiment of the invention. For convenience of explanation, only the parts relevant to the embodiment are shown.

As shown in Fig. 1, the method comprises the following steps:

Step S101: identify the specific data.

Specifically, a system document is read, and the specific data is identified according to the features of the specific data that were written into the system document in advance.

Preferably, the features of the specific data are defined manually and written into the system document in advance.
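
The identification in step S101 can be illustrated with a minimal sketch. The patent does not specify the format of the system document or what a feature looks like, so the sketch assumes a JSON document with a "patterns" list of shell-style path patterns; the names load_features and identify_specific_data are hypothetical.

```python
import json
from fnmatch import fnmatch


def load_features(document_text):
    """Parse the system document (assumed here to be JSON with a "patterns" list)."""
    return json.loads(document_text)["patterns"]


def identify_specific_data(paths, patterns):
    """Return the paths that match at least one feature pattern written in advance."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


# Hypothetical system document, written manually ahead of time.
system_document = '{"patterns": ["/alerts/*", "*.critical"]}'
catalog = ["/alerts/fire.log", "/video/movie.mp4", "/db/ledger.critical"]

specific = identify_specific_data(catalog, load_features(system_document))
print(specific)
```

Any other feature (owner, tag, size class) could replace the path patterns, as long as it is written into the system document before the method runs.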

Step S102: identify the high-performance node in the cluster.

Specifically, the high-performance node is identified by a feature of its host name.
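
As a sketch of step S102, the host-name feature can be modelled as a naming convention. The patent does not say which feature is used, so the "hp-" prefix and the function name identify_high_performance_nodes below are assumptions made only for illustration.

```python
import re

# Assumed convention: high-performance hosts are named with an "hp-" prefix.
HIGH_PERF_PATTERN = re.compile(r"^hp-")


def identify_high_performance_nodes(hostnames, pattern=HIGH_PERF_PATTERN):
    """Return the host names that carry the high-performance feature."""
    return [h for h in hostnames if pattern.search(h)]


nodes = identify_high_performance_nodes(["hp-node01", "std-node02", "hp-node03"])
print(nodes)
```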

Step S103: place the specific data on the high-performance node.

Specifically, the specific data is placed on the high-performance node; when the storage space of the high-performance node is insufficient, part of the inactive data is moved out.
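
The placement in step S103 might look like the following sketch. The patent does not define how inactive data is chosen, so the sketch assumes that the least recently accessed ordinary items are moved out first and that the destination node always has room; Item, Node and place are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    size: int
    last_access: float
    specific: bool = False


@dataclass
class Node:
    capacity: int
    items: dict = field(default_factory=dict)

    def free(self):
        return self.capacity - sum(i.size for i in self.items.values())


def place(item, fast, slow):
    """Place item on the fast node, moving inactive ordinary data to slow if needed."""
    needed = item.size - fast.free()
    # Assumption: "inactive" means least recently accessed; specific data is never moved out.
    candidates = sorted(
        (i for i in fast.items.values() if not i.specific),
        key=lambda i: i.last_access,
    )
    victims = []
    for cand in candidates:
        if needed <= 0:
            break
        victims.append(cand)
        needed -= cand.size
    if needed > 0:
        raise RuntimeError("not enough space even after moving out inactive data")
    for victim in victims:
        slow.items[victim.name] = fast.items.pop(victim.name)
    fast.items[item.name] = item
    return [v.name for v in victims]


fast = Node(capacity=100)
slow = Node(capacity=1000)
fast.items["old"] = Item("old", 60, last_access=1.0)
fast.items["recent"] = Item("recent", 30, last_access=9.0)

evicted = place(Item("alarm", 50, last_access=0.0, specific=True), fast, slow)
print(evicted, sorted(fast.items))
```

The victims are selected before anything is moved, so a placement that cannot succeed leaves both nodes unchanged.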

Step S104: periodically check the position of the specific data and, when the specific data is not on the high-performance node, repeat step S103.

Specifically, the position of the specific data is checked when data migration and data backup occur in the storage system.

Specifically, in a hierarchical storage system the specific data may be migrated to a lower-performance node because its access heat is not high; therefore, when migration objects are selected, the specific data is screened out.

Specifically, during data migration caused by data backup, the node holding the specific data may fail or leave the cluster, so that the position of the specific data changes; steps S103 to S104 are then repeated.
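
One pass of the check in step S104 can be sketched as follows. The location table, the relocate callback and the name check_positions are assumptions for illustration; in a real system the pass would be triggered by a timer or by migration and backup events, which the sketch leaves out.

```python
def check_positions(locations, specific_names, high_perf_nodes, relocate):
    """Re-place every specific item that is not on a high-performance node.

    locations maps an item name to the host currently holding it;
    relocate(name) stands in for step S103 and returns the new host.
    """
    moved = []
    for name in specific_names:
        if locations.get(name) not in high_perf_nodes:
            locations[name] = relocate(name)
            moved.append(name)
    return moved


high_perf = {"hp-node01", "hp-node03"}
locations = {"alarm.log": "std-node02", "ledger.db": "hp-node01"}

moved = check_positions(
    locations,
    ["alarm.log", "ledger.db"],
    high_perf,
    relocate=lambda name: "hp-node01",
)
print(moved, locations)
```

An item whose node has failed or left the cluster has no valid entry in the location table, so it is treated the same way as an item sitting on an ordinary node.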

Based on the same concept, the second embodiment of the invention provides a device for automatically placing data. As shown in Fig. 2, the device comprises:

A data identifying module 201, used for identifying the specific data.

Specifically, the data identifying module reads a system document and identifies the specific data according to the features of the specific data that were written into the system document in advance.

A node identifying module 202, used for identifying the high-performance node in the cluster.

Specifically, the node identifying module identifies the high-performance node by a feature of its host name.

A matching module 203, used for placing the specific data on the high-performance node.

Specifically, when the storage space of the high-performance node is insufficient, the matching module 203 moves out part of the inactive data.

An adjusting module 204, used for periodically checking the position of the specific data and starting the matching module 203 when the specific data is not on the high-performance node.
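
How the four modules cooperate can be sketched by wiring them into one object. This is only one possible arrangement under stated assumptions: the class name DataPlacementDevice, the callable interfaces and the location table are hypothetical, and modules 201 to 203 are passed in as plain functions.

```python
class DataPlacementDevice:
    """Hypothetical wiring of modules 201 to 204 described above."""

    def __init__(self, identify_data, identify_nodes, match):
        self.identify_data = identify_data    # data identifying module 201
        self.identify_nodes = identify_nodes  # node identifying module 202
        self.match = match                    # matching module 203

    def adjust(self, locations):
        """Adjusting module 204: start the matching module for misplaced specific data."""
        nodes = set(self.identify_nodes())
        restarted = []
        for name in self.identify_data():
            if locations.get(name) not in nodes:
                locations[name] = self.match(name, nodes)
                restarted.append(name)
        return restarted


device = DataPlacementDevice(
    identify_data=lambda: ["alarm.log"],
    identify_nodes=lambda: ["hp-node01"],
    match=lambda name, nodes: sorted(nodes)[0],
)
locations = {"alarm.log": "std-node02"}
restarted = device.adjust(locations)
print(restarted, locations)
```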

Those skilled in the art will appreciate that the modules of the device in an embodiment may be distributed in the device as described in the embodiment, or may be changed accordingly and located in one or more devices different from that of the present embodiment. The modules of the above embodiments may be merged into one module or further split into a plurality of submodules.

From the above description of the embodiments, those skilled in the art can clearly understand that the invention can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although the former is the better implementation in many cases. Based on this understanding, the technical solution of the invention in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a terminal device (which may be a mobile phone, personal computer, server, network device, or the like) to carry out the method described in each embodiment of the invention.

The above is only a preferred implementation of the invention. It should be pointed out that those skilled in the art can make improvements and modifications without departing from the principle of the invention, and these improvements and modifications should also be regarded as falling within the protection scope of the invention.

Claims

1. A data placement method, characterized in that the method comprises the following steps:

S101, identifying specific data;

S102, identifying a high-performance node in a cluster;

S103, placing the specific data on the high-performance node;

S104, periodically checking the position of the specific data and, when the specific data is not on the high-performance node, repeating step S103.

2. The data placement method as claimed in claim 1, characterized in that step S101 comprises the following step:

reading a system document and identifying the specific data according to the features of the specific data written into the system document in advance.

3. The data placement method as claimed in claim 1 or 2, characterized in that step S102 comprises the following step: identifying the high-performance node by a feature of its host name.

4. The data placement method as claimed in claim 1 or 2, characterized in that step S103 comprises the following step: when the storage space of the high-performance node is insufficient, moving out part of the inactive data.

5. The data placement method as claimed in claim 1 or 2, characterized in that step S104 comprises the following steps: when data migration and data backup occur in the storage system, checking the position of the specific data and, when the specific data is not on the high-performance node, repeating step S103.

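6. A data placement device, characterized in that the device comprises:

a data identifying module, used for identifying specific data;

a node identifying module, used for identifying a high-performance node in a cluster;

a matching module, used for placing the specific data on the high-performance node;

an adjusting module, used for periodically checking the position of the specific data and starting the matching module when the specific data is not on the high-performance node.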

7. The data placement device as claimed in claim 6, characterized in that the data identifying module reads a system document and identifies the specific data according to the features of the specific data written into the system document in advance.

8. The data placement device as claimed in claim 6 or 7, characterized in that the node identifying module identifies the high-performance node by a feature of its host name.

9. The data placement device as claimed in claim 6 or 7, characterized in that the matching module moves out part of the inactive data when the storage space of the high-performance node is insufficient.