CN103106044B - Classification storage power-economizing method - Google Patents

Classification storage power-economizing method

Info

Publication number
CN103106044B
CN103106044B (Application CN201210539442.4A)
Authority
CN
China
Prior art keywords
storage
migration
data block
data
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210539442.4A
Other languages
Chinese (zh)
Other versions
CN103106044A (en)
Inventor
张森林
冯圣中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201210539442.4A priority Critical patent/CN103106044B/en
Publication of CN103106044A publication Critical patent/CN103106044A/en
Application granted granted Critical
Publication of CN103106044B publication Critical patent/CN103106044B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The present invention provides a hierarchical storage power-saving method comprising the following steps: automatic storage tiering: when the cluster starts, the storage tier of each host is identified automatically, and a proportion of the nodes on the lower storage tiers is switched to power-saving mode; directed access: files are stored to and read from a nearby storage tier that is second highest in the hierarchy; hot-data discovery: the access information of each data block is recorded, the value of each accessed data block is derived from the recorded information, and the blocks are queued in descending order of value. The hierarchical storage method of the present invention guarantees the energy saving of the cluster.

Description

Hierarchical storage power-saving method
Technical field
The present invention relates to storage technology in the computer field, and in particular to a hierarchical storage power-saving method.
Background art
With the explosive growth of data volume, server clusters that store and process massive amounts of data have become more and more common, and the energy consumption of these server clusters is drawing increasing attention.
According to statistics, in the cost of building a server cluster, the power consumed by the servers and the cooling system alone accounts for 20%, yet most servers are in a low-load state most of the time, with utilization generally no higher than 30%, which causes a great waste of electric power. Cluster power-saving techniques emerged in order to reduce, as far as possible, the unnecessary loss caused by this waste.
The key point of current cluster power-saving techniques is to concentrate the tasks in the cluster onto a few servers and to switch the remaining servers to power-saving mode or turn them off, thereby achieving cluster-level energy saving.
The premise of these cluster power-saving techniques is that data accesses in the cluster are scattered and not fixed, which is related to how the data is distributed across the whole cluster. Today's server clusters mostly implement load-balancing techniques, so that the data can be evenly distributed across the servers and the situation in which one server is overloaded while other servers sit idle is avoided, thereby achieving concurrent processing.
Industrial research shows, however, that only 20% of the data is active while the remaining 80% is inactive, and the activity of the data also changes over time. So even if the cluster has achieved load balance, because the access characteristics of the data are not uniform, some servers are bound to be heavily loaded while the remaining servers are lightly loaded.
What current cluster power-saving techniques actually do is concentrate the load, putting the whole cluster back into an unbalanced state, and then switch the idle nodes to power-saving mode. This approach is in fact the inverse of load balancing. Although it temporarily solves part of the problem, it also has a price: for example, the load of every node in the cluster must be monitored, which requires tools such as sensors and adds further cost.
Thus the low server utilization in the cluster and the resulting large waste of electric power are in fact the inevitable outcome of applying load-balancing techniques across the whole cluster. Yet if load balancing is not implemented, an individual server in the cluster may become an access bottleneck. To solve the problem of cluster power consumption while ensuring that no individual server in the cluster becomes an access bottleneck, a brand-new way of organizing data is therefore needed.
Summary of the invention
To solve the above technical problem, the present invention provides a hierarchical storage power-saving method that is low in cost and highly automated, the method comprising the following steps:
Automatic storage tiering: when the cluster starts, the storage tier of each host is identified from its host name, and a proportion of the nodes on the lower storage tiers is switched to power-saving mode;
Directed access: files are stored to and read from a nearby storage tier that is second highest in the hierarchy and works in normal mode;
Hot-data discovery: the access information of each data block is recorded and the migration timing is judged; when the migration time arrives, the value of each accessed data block is derived from the recorded information, and the blocks are queued in descending order of value;
Data block migration: data blocks of high value are migrated to a higher storage tier, and data blocks of low value are migrated to a lower storage tier.
Preferably, the method further comprises adaptive adjustment: after the data migration is completed, the data-block access information is updated and monitoring is restarted.
Preferably, the recorded information is processed by an information valuation model, and the data-block access information includes the accessing user, the access time and the data-block information.
Preferably, on the basis of the data-block value queue obtained after processing by the information valuation model, a queue filtering model and a path matching model are used to form concrete data migration tasks, and a migration control model is used to complete the data migration.
Preferably, the queue filtering model is: the data segments that do not need to be migrated are filtered out according to thresholds; every data segment in the queue formed after filtering has a determined migration direction, and the thresholds reflect the result of the previous migration on the corresponding storage tier.
Preferably, the path matching model is: after every block in the queue has determined its migration direction, migration sources and migration targets that are close to each other are determined; the migration source is preferentially a node with little remaining space, a light load and normal working mode, and the migration target is preferentially a lightly loaded node.
Preferably, the migration control model is: the migration rate is controlled, and the data migration tasks are executed in batches by multiple threads, so as to reduce the impact of the migration process on the access performance of the nodes in the cluster.
Preferably, the step of updating the data-block information and restarting the monitoring is specifically:
storing the valuation result of each data block for use in the next valuation;
for data blocks that have been deleted, deleting the access records retained by the system;
updating the threshold of each storage tier according to the actual result of the migration;
waking up the monitoring process and waiting for the next data migration.
Preferably, during automatic storage tiering, the storage hierarchy includes at least 2 tiers, and the criterion for dividing the tiers is: the higher the storage tier, the better the access performance and the shorter the response time for processing user requests.
Preferably, 40% of the second-tier storage nodes and 60% of the third-tier storage nodes are switched to power-saving mode.
The hierarchical storage power-saving method of the present invention applies hierarchical storage technology to the cluster: tiered storage media are used in the cluster, access hotspots are fixed on the higher storage tiers, and a proportion of the nodes on the lower storage tiers is switched to power-saving mode, which guarantees the energy saving of the cluster while also saving cost.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the hierarchical storage power-saving method according to one embodiment of the present invention.
Detailed description of the invention
The present invention is described in further detail below with reference to the accompanying drawing and a specific embodiment.
As shown in Fig. 1, which is a schematic flow chart of the hierarchical storage power-saving method of one embodiment of the present invention, the hierarchical storage method of the present invention comprises the following steps:
Step S1: automatic storage tiering.
When the cluster starts, the host name of each host is used to identify the storage tiers it contains, and a proportion of the nodes on the lower storage tiers is switched to power-saving mode. In the present embodiment, when the Hadoop cluster starts, the system automatically identifies the access performance of each node by means of the "host-name labelling method". In the present embodiment, 40% of the second-tier storage nodes and 60% of the third-tier storage nodes are switched to power-saving mode; of course, in other embodiments the number of storage tiers and the proportion switched to power-saving mode can be adjusted freely, and such variations fall within the scope of protection of this patent.
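The patent does not spell out the "host-name labelling method". The following is a minimal sketch assuming the tier is encoded as a suffix of the host name (e.g. "node07-tier2") and that the stated proportions of tier-2 and tier-3 nodes are switched to power-saving mode; the naming convention, function names and use of random sampling are illustrative assumptions rather than part of the disclosure.

```python
import random
import re

# Illustrative "host-name labelling": assume the tier is a suffix such as "node07-tier2".
TIER_PATTERN = re.compile(r"-tier(\d+)$")

def identify_tier(hostname: str) -> int:
    """Return the storage tier encoded in the host name (1 = highest tier)."""
    match = TIER_PATTERN.search(hostname)
    if not match:
        raise ValueError(f"host name {hostname!r} carries no tier label")
    return int(match.group(1))

def select_power_saving_nodes(hostnames, ratios={2: 0.4, 3: 0.6}):
    """Pick the given proportion of tier-2 and tier-3 nodes to switch to power-saving mode."""
    by_tier = {}
    for host in hostnames:
        by_tier.setdefault(identify_tier(host), []).append(host)
    to_sleep = []
    for tier, ratio in ratios.items():
        nodes = by_tier.get(tier, [])
        to_sleep.extend(random.sample(nodes, int(len(nodes) * ratio)))
    return to_sleep

if __name__ == "__main__":
    hosts = [f"node{i:02d}-tier{t}" for t in (1, 2, 3) for i in range(1, 6)]
    print(select_power_saving_nodes(hosts))  # e.g. 2 of 5 tier-2 and 3 of 5 tier-3 nodes
```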
Step S2: directed access.
Files are stored to and read from a nearby storage tier that is second highest in the hierarchy and works in normal mode.
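As an illustration of directed access, the sketch below selects, among the normal-mode nodes of the second-highest tier, the node closest to the client; the Node structure and the distance metric are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    tier: int          # 1 = highest storage tier
    distance: int      # e.g. network hops between the client and the node
    power_saving: bool

def choose_access_node(nodes, preferred_tier=2):
    """Directed access: among the normal-mode nodes on the preferred (second-highest)
    tier, pick the node closest to the client."""
    candidates = [n for n in nodes if n.tier == preferred_tier and not n.power_saving]
    if not candidates:
        raise RuntimeError("no normal-mode node available on the preferred storage tier")
    return min(candidates, key=lambda n: n.distance)
```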
Step S3: hot-data discovery.
The access information of each data block is recorded and the migration timing is judged; when the migration time arrives, the value of each accessed data block is derived from the recorded information and the blocks are queued in descending order of value. In the present embodiment, the nodes in the cluster are divided into 3 different storage tiers: the higher the storage tier, the better the access performance of the configured hard disks, the smaller their capacity and the higher their price. Therefore only a small amount of data can be stored on the nodes of the highest tier. Under normal circumstances, only a small portion of all the data in a cluster is accessed frequently. By recording the access information of each data block and processing this information with an information valuation model, a value is obtained for each block; the larger the value, the more frequently the block is accessed and the higher the storage tier it should occupy.
Clients read files in units of blocks, and the system records every read operation on a block; the content of a record includes the accessing user, the access time, the data-block information and so on, and every read generates one record. At a particular moment the information valuation model is used to process these records. The objects processed by the model are blocks, and the parameters used include the access time, the number of accesses, the number of users, the block size, the degree of association between the block and other blocks, and the historical value of the block; a formula is used to compute a concrete value that measures how "hot" the block is, and the blocks are arranged into a queue in descending order of value. On the basis of the block value queue produced by this preliminary processing, the data migration algorithm uses the queue filtering model and the path matching model to form concrete migration tasks, and finally uses the migration control model to complete the data migration.
The queue filtering model filters out the data blocks that do not need to be migrated according to the thresholds of each storage tier; these thresholds record the maximum value of all blocks previously moved down and the minimum value of all blocks previously moved up. Every block in the queue formed after filtering has a determined migration direction. In other embodiments, during automatic storage tiering the storage hierarchy includes at least 2 tiers, and the criterion for dividing the tiers is: the higher the storage tier, the better the access performance and the shorter the response time for processing user requests.
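The embodiment lists the parameters of the information valuation model (access time, access count, user count, block size, association with other blocks, historical value) but not the formula itself. The sketch below therefore uses an assumed weighted sum with a simple recency decay, followed by the queue filtering step driven by the per-tier thresholds; all weights, field names and the exact scoring rule are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class BlockStats:
    block_id: str
    access_count: int
    user_count: int
    size_mb: float
    association: float      # degree of association with other blocks, 0..1
    history_value: float    # value stored at the previous valuation
    last_access: datetime

def block_value(stats: BlockStats, now: datetime,
                w_count=1.0, w_users=0.5, w_assoc=2.0, w_history=0.3) -> float:
    """Assumed valuation formula: a weighted sum of the recorded parameters with a
    recency decay; the patent names the parameters but not the weights."""
    age_hours = max((now - stats.last_access).total_seconds() / 3600.0, 1.0)
    score = (w_count * stats.access_count
             + w_users * stats.user_count
             + w_assoc * stats.association
             + w_history * stats.history_value)
    return score / (age_hours * max(stats.size_mb, 1.0))

def build_value_queue(all_stats, now):
    """Hot-data discovery: queue the blocks in descending order of value."""
    return sorted(all_stats, key=lambda s: block_value(s, now), reverse=True)

def filter_queue(value_queue, now, up_threshold, down_threshold):
    """Queue filtering model (sketch): blocks above the minimum value of previously
    up-migrated blocks move up, blocks below the maximum value of previously
    down-migrated blocks move down, and the rest need no migration."""
    tasks = []
    for stats in value_queue:
        v = block_value(stats, now)
        if v >= up_threshold:
            tasks.append((stats.block_id, "up"))
        elif v <= down_threshold:
            tasks.append((stats.block_id, "down"))
    return tasks
```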
Step S4: data block migration.
Data blocks of high value are migrated to a higher storage tier, and data blocks of low value are migrated to a lower storage tier. After every block in the queue has determined its migration direction, the migration source and the migration target need to be determined. The migration source is preferentially a node with little remaining space, a light load and normal working mode; if the space of the normal-mode nodes is insufficient, a node in power-saving mode is automatically promoted to normal working mode. The migration target must have enough space to accommodate the migrated blocks, and lightly loaded nodes are preferred; at the same time, the migration source and the migration target must be close enough to each other. Once every block in the queue has a concrete migration source and migration target, concrete migration tasks are formed. The migration control model executes these tasks in batches with multiple threads, for example with only 50 threads per batch used for migration and at most 5 migration threads on each node, so that the impact of the migration process on the access performance of the nodes in the cluster is minimal.
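A minimal sketch of the path matching and migration control models follows. The source/target preferences (little remaining space, light load, normal mode, closeness) and the batch figures (50 threads per batch, at most 5 per node) come from the embodiment; the scoring functions, the StorageNode fields and the transfer primitives are assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

@dataclass
class StorageNode:
    name: str
    tier: int                                       # 1 = highest storage tier
    free_gb: float
    load: float                                     # recent utilization, 0..1
    power_saving: bool
    distance: dict = field(default_factory=dict)    # distance to other nodes by name

def match_source_and_target(block_size_gb, direction, nodes, current_tier):
    """Path matching model (sketch): the source prefers little free space, light load
    and normal mode; the target prefers a light load, enough space and closeness."""
    dest_tier = current_tier - 1 if direction == "up" else current_tier + 1
    sources = [n for n in nodes if n.tier == current_tier and not n.power_saving]
    source = min(sources, key=lambda n: (n.free_gb, n.load))
    targets = [n for n in nodes if n.tier == dest_tier and n.free_gb >= block_size_gb]
    target = min(targets, key=lambda n: (n.load, source.distance.get(n.name, float("inf"))))
    return source, target

def copy_block(block_id, source, target):
    print(f"copy {block_id}: {source.name} -> {target.name}")   # assumed transfer primitive

def delete_block(block_id, source):
    print(f"delete {block_id} on {source.name}")                # assumed cleanup primitive

# Migration control model (sketch): 50 threads per batch, at most 5 per node,
# mirroring the figures given in the embodiment.
node_slots = {}                                                 # node name -> Semaphore(5)

def migrate_block(task):
    block_id, source, target = task
    for node in (source, target):
        node_slots.setdefault(node.name, threading.Semaphore(5))
    first, second = sorted((source, target), key=lambda n: n.name)  # fixed order avoids deadlock
    with node_slots[first.name], node_slots[second.name]:
        copy_block(block_id, source, target)
        delete_block(block_id, source)

def run_batch(tasks, batch_size=50):
    """Execute one batch of migration tasks with a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        list(pool.map(migrate_block, tasks))
```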
Step S5: adaptive adjustment.
After the data migration is completed, the data-block access information is updated and monitoring is restarted; in the present embodiment, the migration cycle is adjusted in time according to the trigger condition of the migration. The step of updating the data-block information and restarting the monitoring is specifically:
The valuation result of each data block is stored for use in the next valuation;
For data blocks that have been deleted, the access records retained by the system are deleted;
The threshold of each storage tier is updated according to the actual result of the migration;
The monitoring process is woken up and waits for the next data migration.
During the migration some nodes that were in power-saving mode (located on the second-tier and third-tier storage) may have become normal working mode, which indicates that the remaining space of the normal-mode nodes on that storage tier was insufficient. According to the principle of locality of data access, nodes that are heavily loaded but have had no access records for 2 consecutive cycles are then set to power-saving mode, while some of the nodes in power-saving mode are set to normal working mode, ensuring that the free space of that storage tier remains above 10% of its total capacity.
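The adaptive adjustment described above can be sketched as follows, assuming a node object that records capacity, free space and power mode; the 2-cycle idle rule and the 10% free-space floor come from the embodiment, while everything else (field names, selection order) is illustrative.

```python
from dataclasses import dataclass

@dataclass
class TierNode:
    name: str
    capacity_gb: float
    free_gb: float
    power_saving: bool

def adjust_tier(nodes, idle_cycles, min_free_ratio=0.10):
    """Adaptive adjustment (sketch) for one storage tier: nodes with no access records
    for 2 consecutive cycles are put into power-saving mode, then sleeping nodes are
    woken until the tier keeps at least 10% of its total capacity free."""
    for node in nodes:
        if not node.power_saving and idle_cycles.get(node.name, 0) >= 2:
            node.power_saving = True
    total = sum(n.capacity_gb for n in nodes)
    def active_free():
        return sum(n.free_gb for n in nodes if not n.power_saving)
    for node in sorted(nodes, key=lambda n: -n.free_gb):
        if active_free() >= min_free_ratio * total:
            break
        if node.power_saving:
            node.power_saving = False
    return nodes
```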
After step S5, the process returns to step S2, and the data scheduling process is carried out cyclically.
The hierarchical storage power-saving method of the present invention adopts the approach of hierarchical storage: tiered storage media are used in the Hadoop cluster and access hotspots are fixed on the higher storage tiers, so there is no need to migrate tasks; only the storage nodes of the lower tiers need to be put into power-saving mode. This guarantees the energy saving of the cluster while ensuring that no individual server in the cluster becomes an access bottleneck, killing two birds with one stone.
It is understood that a person of ordinary skill in the art may make various other corresponding changes and variations according to the technical concept of the present invention, and all such changes and variations shall fall within the protection scope of the claims of the present invention.

Claims (3)

1. A hierarchical storage power-saving method, characterised in that the method comprises the following steps:
automatic storage tiering: the Hadoop cluster starts, the storage tier of each host is identified from its host name, and a proportion of the nodes on the lower storage tiers is switched to power-saving mode;
directed access: files are stored to and read from a nearby storage tier that is second highest in the hierarchy and works in normal mode;
hot-data discovery: the access information of each data block is recorded and the migration timing is judged; when the migration time arrives, the value of each accessed data block is derived from the recorded information, and the blocks are queued in descending order of value; the recorded information is processed by an information valuation model, and the access information of a data block includes the accessing user, the access time and the data-block information;
data block migration: data blocks of high value are migrated to a higher storage tier, and data blocks of low value are migrated to a lower storage tier; on the basis of the data-block value queue obtained after processing by the information valuation model, a queue filtering model and a path matching model are used to form concrete data migration tasks, and a migration control model is used to complete the data migration; the queue filtering model is: the data segments that do not need to be migrated are filtered out according to thresholds, every data segment in the queue formed after filtering has a determined migration direction, and the thresholds reflect the result of the previous migration on the corresponding storage tier; the path matching model is: after every block in the queue has determined its migration direction, migration sources and migration targets that are close to each other are determined, the migration source is preferentially a node with little remaining space, a light load and normal working mode, and the migration target is preferentially a lightly loaded node; the migration control model is: the migration rate is controlled and the data migration tasks are executed in batches by multiple threads, so as to reduce the impact of the migration process on the access performance of the nodes in the cluster;
adaptive adjustment: after the data migration is completed, the data-block access information is updated and monitoring is restarted, the specific steps being:
storing the valuation result of each data block for use in the next valuation;
for data blocks that have been deleted, deleting the access records retained by the system;
updating the threshold of each storage tier according to the actual result of the migration;
waking up the monitoring process and waiting for the next data migration.
2. The hierarchical storage power-saving method according to claim 1, characterised in that: during automatic storage tiering, the storage hierarchy includes at least 2 tiers, and the criterion for dividing the tiers is: the higher the storage tier, the better the access performance and the shorter the response time for processing user requests.
3. The hierarchical storage power-saving method according to claim 1, characterised in that: 40% of the second-tier storage nodes and 60% of the third-tier storage nodes are switched to power-saving mode.
CN201210539442.4A 2012-12-13 2012-12-13 Classification storage power-economizing method Active CN103106044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210539442.4A CN103106044B (en) 2012-12-13 2012-12-13 Classification storage power-economizing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210539442.4A CN103106044B (en) 2012-12-13 2012-12-13 Classification storage power-economizing method

Publications (2)

Publication Number Publication Date
CN103106044A CN103106044A (en) 2013-05-15
CN103106044B (en) 2016-09-07

Family

ID=48313940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210539442.4A Active CN103106044B (en) 2012-12-13 2012-12-13 Classification storage power-economizing method

Country Status (1)

Country Link
CN (1) CN103106044B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294167B (en) * 2013-05-21 2016-02-10 暨南大学 A kind of low energy consumption cluster-based storage reproducing unit based on data behavior and method
CN104462389B (en) * 2014-12-10 2018-01-30 上海爱数信息技术股份有限公司 Distributed file system implementation method based on classification storage
CN106933859B (en) * 2015-12-30 2020-10-20 中国移动通信集团公司 Medical data migration method and device
CN107741940B (en) * 2016-11-24 2021-03-02 腾讯科技(深圳)有限公司 Data storage method and storage system
CN107122126B (en) * 2016-12-22 2020-09-08 华为技术有限公司 Data migration method, device and system
CN106990924A (en) * 2017-04-07 2017-07-28 广东浪潮大数据研究有限公司 A kind of data migration method and system
CN109408401B (en) * 2017-08-18 2023-03-24 旺宏电子股份有限公司 Management system and management method of memory device
CN107807796B (en) * 2017-11-17 2021-03-05 北京联想超融合科技有限公司 Data layering method, terminal and system based on super-fusion storage system
CN108595108A (en) * 2017-12-29 2018-09-28 北京奇虎科技有限公司 A kind of moving method and device of data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201801A (en) * 2006-12-11 2008-06-18 南京理工大学 Classification storage management method for VOD system
CN102158513A (en) * 2010-02-11 2011-08-17 联想(北京)有限公司 Service cluster and energy-saving method and device thereof
CN102480469A (en) * 2010-11-29 2012-05-30 北京中和威软件有限公司 Method and device for load balancing in SIP (session initiation protocol) service cluster based on energy balance

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101092778B1 (en) * 2009-10-19 2011-12-09 최해용 Apparatus and method for minimizing power consumption in computer using user's information


Also Published As

Publication number Publication date
CN103106044A (en) 2013-05-15

Similar Documents

Publication Publication Date Title
CN103106044B (en) Classification storage power-economizing method
CN103150263B (en) Classification storage means
EP3847549B1 (en) Minimizing impact of migrating virtual services
CN103631657B (en) A kind of method for scheduling task based on MapReduce
CN103729248B (en) A kind of method and apparatus of determination based on cache perception task to be migrated
CN103106152B (en) Based on the data dispatching method of level storage medium
CN104272244B (en) For being scheduled to handling to realize the system saved in space, method
CN104407926B (en) A kind of dispatching method of cloud computing resources
CN103095805A (en) Cloud storage system of data intelligent and decentralized management
CN102279771A (en) Method and system for adaptively allocating resources as required in virtualization environment
CN110058932A (en) A kind of storage method and storage system calculated for data flow driven
CN102868763A (en) Energy-saving dynamic adjustment method of virtual web application cluster in cloud computing environment
CN103945005B (en) Dynamic load leveling framework based on many evaluation indexes
CN104391737A (en) Method for optimizing load balance in cloud platform
CN109271232A (en) A kind of cluster resource distribution method based on cloud computing platform
CN106681668A (en) Hybrid storage system and storage method based on solid state disk caching
CN104462432A (en) Self-adaptive distributed computing method
CN103503412B (en) For the method and device of scheduling resource
CN105975345B (en) A kind of video requency frame data dynamic equalization memory management method based on distributed memory
CN101986661A (en) Improved MapReduce data processing method under virtual machine cluster
CN107908461A (en) A kind of resource dynamic retractility device and implementation method based on Docker
US20170109282A1 (en) High-performance distributed caching
CN112269632A (en) Scheduling method and system for optimizing cloud data center
CN115220900A (en) Energy-saving scheduling method and system based on operation power consumption prediction
US8745032B1 (en) Rejecting a request in a database system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant