CN110535898B

CN110535898B - Method for storing and complementing copies and selecting nodes in big data storage and management system

Info

Publication number: CN110535898B
Application number: CN201810545954.9A
Authority: CN
Inventors: 丁博; 徐大青; 张展国; 贺彪; 杨迎春; 王少鹏; 刘一擎; 丁亮
Original assignee: State Grid Corp of China SGCC; Xuji Group Co Ltd; Xuchang XJ Software Technology Co Ltd
Current assignee: State Grid Corp of China SGCC; Xuji Group Co Ltd; Xuchang XJ Software Technology Co Ltd
Priority date: 2018-05-25
Filing date: 2018-05-25
Publication date: 2022-10-04
Anticipated expiration: 2038-05-25
Also published as: CN110535898A

Abstract

The invention relates to a copy storage method, a copy completion method, a node selection method and a management system for big data storage. The node selection method comprises: selecting evaluation indexes for copy storage nodes according to the real-time state information and the historical fault information of each data node server, including a predicted value of the probability of a data fault among the evaluation indexes, determining the weight of each evaluation index, and calculating from the weights which data nodes are used for copy storage. Based on this node selection method, suitable nodes are selected to store the copies according to a three-copy scheme. When a copy is lost through a fault and must be completed, completion is first attempted on an active node of the rack where the failed node is located; when that rack cannot work normally, an active node with a similar failure rate is selected for completion. The invention effectively improves write efficiency and load balance during storage without affecting copy safety, and fundamentally addresses the need to rebalance the cluster after long-term operation.

Description

Method for storing and complementing copies and selecting nodes in big data storage and management system

Technical Field

The invention belongs to the technical field of big data storage and cloud computing, and particularly relates to a copy storage, completion and node selection method and a management system in big data storage.

Background

Existing big data storage systems are generally distributed (for example HDFS). Node failure and hardware failure must be considered, and replica technology is used to ensure the reliability and availability of the system. Proper placement and selection of data copies also makes data access more efficient. A three-copy strategy is commonly adopted in big data storage; it guarantees data safety and effectively supports distributed computation. However, where local data access is concerned, an unreasonable copy distribution affects computations with high data-locality requirements: tasks may be allocated to low-performance machines that hold the copies, which reduces the performance of the whole cluster.

The load balancing mentioned in currently common copy storage strategies mainly addresses load imbalance from the perspective of data volume by rebalancing data that already exists; implementing it essentially means transferring copies. Load balancing can therefore only remedy an unreasonable copy storage policy. The ideal approach is to select or adjust the storage position of a copy autonomously, according to the current performance of the cluster, at the time the copy is placed.

Existing copy storage strategies also provide methods that place copies according to rack, hard disk usage and load conditions, but each method refers to only a single influencing factor when selecting, so load balance and storage efficiency still cannot both be taken into account. The problem is particularly pronounced in heterogeneous clusters, where, for example, some lower-performance machines may be overloaded while some higher-performance machines are idle. Copying and transferring copies in a big data storage cluster is generally caused by server hardware faults. The design service life of a typical server is 5-7 years, and the actual service life of a single server depends on its batch, its service intensity and its service environment. Existing copy storage methods do not consider server failure or aging. For example, the reference "improved strategy for placing copies of HDFS based on support vector machine" (author: army, computer engineering, vol. 41, no. 11, November 2015) considers only five factors: relative load rate, network distance, disk performance, CPU performance and memory. When a server fails, the copying and migration of data are then random, which leads to disordered copy placement.

From the perspective of operations research, the copy storage strategy in big data storage can be regarded as a decision problem that is difficult to analyse quantitatively. The analytic hierarchy process addresses such problems: the relevant evaluation indexes are compared with one another, the relative importance of several groups of relations can be analysed quantitatively, and the result is finally obtained by weighting the influencing factors. In theory, an ideal result can be obtained as long as suitable weights are given.

Disclosure of Invention

The invention provides a method for storing, complementing and selecting a copy in big data storage and a management system, which are used for solving the problems that the existing cluster copy storage strategy does not effectively sense the state of each data node server, the influence factor of selection reference is single, and load balance and storage efficiency cannot be considered at the same time.

In order to solve the technical problem, the method for selecting the copy storage node in the big data storage comprises the following four unit schemes:

In the first unit scheme, evaluation indexes for copy storage nodes are selected according to the real-time state information and the historical fault information of each data node server. The evaluation indexes comprise a disk utilization rate, a disk I/O load rate, a CPU load rate, a memory load rate, a read-write task connection rate and a node failure rate, where the read-write task connection rate is the ratio of the number of read-write task connections on the current server to the maximum number of read-write task connections allowed by the file system. The weight of each evaluation index is determined, and a data node is then selected as the copy storage position according to a reference value calculated by the following formula:

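$$\omega = \lambda_0\,\omega_{disk\_used} + \lambda_1\,\omega_{disk\_io} + \lambda_2\,\omega_{cpu} + \lambda_3\,\omega_{mem} + \lambda_4\,\omega_{process} + \lambda_5\,\omega_{fr}$$

where $\omega$ is the reference value used to select a data node; $\omega_{disk\_used}$, $\omega_{disk\_io}$, $\omega_{cpu}$, $\omega_{mem}$, $\omega_{process}$ and $\omega_{fr}$ are respectively the disk utilization rate, the disk I/O load rate, the CPU load rate, the memory load rate, the read-write task connection rate and the node failure rate; $\lambda_0 + \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 = 1$; and $\lambda_0, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \omega_{disk\_used}, \omega_{disk\_io}, \omega_{cpu}, \omega_{mem}, \omega_{process}, \omega_{fr} \in [0, 1]$.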
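The weighted sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `NodeIndexes`, `reference_value` and `select_node` are hypothetical, and the rule that the node with the lowest reference value is chosen is an assumption, since the text does not state in which direction the reference value is compared.

```python
from dataclasses import dataclass, astuple
from typing import Dict, Sequence


@dataclass(frozen=True)
class NodeIndexes:
    """The six evaluation indexes of one data node, each already scaled to [0, 1]."""
    disk_used: float  # disk utilization rate
    disk_io: float    # disk I/O load rate
    cpu: float        # CPU load rate
    mem: float        # memory load rate
    process: float    # read-write task connection rate
    fr: float         # node failure rate


def reference_value(indexes: NodeIndexes, weights: Sequence[float]) -> float:
    """Weighted sum of the six indexes; weights are (lambda_0, ..., lambda_5)."""
    values = astuple(indexes)
    if len(weights) != len(values):
        raise ValueError("expected one weight per evaluation index")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if any(not 0.0 <= x <= 1.0 for x in (*weights, *values)):
        raise ValueError("weights and indexes must lie in [0, 1]")
    return sum(lam * w for lam, w in zip(weights, values))


def select_node(nodes: Dict[str, NodeIndexes], weights: Sequence[float]) -> str:
    # Assumption (not stated in the source): a lower reference value
    # marks the better placement target, since every index is a load or fault rate.
    return min(nodes, key=lambda name: reference_value(nodes[name], weights))
```

Setting one weight to 1 and the others to 0 reduces the selection to a single evaluation index.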

In the second unit scheme, on the basis of the first unit scheme, the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are arranged in levels, the relations between the levels are analysed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be processing different tasks, the corresponding matrix is matched adaptively so as to correct the copy placement position.

In the third unit scheme, on the basis of the first unit scheme, the node failure rate is the ratio of the fault time of the data node to its online running time, or the ratio of the used life of the data node to its design life.

In the fourth unit scheme, on the basis of the second unit scheme, the node failure rate is the ratio of the fault time of the data node to its online running time, or the ratio of the used life of the data node to its design life.

The method for storing the copy in the big data storage comprises the following four unit schemes:

In the first unit scheme, the default number of copies is 3: two copies are stored on different nodes of the same rack, and the other copy is stored on a node of a different rack. When the copies are stored, if the client is a data node, the first copy is placed on that node; if the client is not a data node, a node is selected from the nodes on all racks, according to the above method for selecting copy storage nodes in big data storage, to hold the first copy. A node on a rack different from that of the first copy is then selected by the same selection method to hold the second copy, and a node on the same rack as the second copy, and on a different rack from the first copy, is selected by the same selection method to hold the third copy.

The method for complementing the copy in the big data storage comprises the following five unit schemes:

in the first unit scheme, when the number of copies to be complemented is less than 3, a rack where each lost copy is located is obtained, whether an active node exists on the rack where each fault node is located is judged, if the active node exists, a data node is selected from the active nodes according to a set node selection method for complementing the copies, and if the active node does not exist, a node is selected from the nodes with the same fault rate as the fault node according to the set node selection method for complementing the copies.

In the second unit scheme, on the basis of the first unit scheme, the set node selection method is as follows: evaluation indexes for copy storage nodes are selected according to the real-time state information and the historical fault information of each data node server. The evaluation indexes comprise a disk utilization rate, a disk I/O load rate, a CPU load rate, a memory load rate, a read-write task connection rate and a node failure rate, where the read-write task connection rate is the ratio of the number of read-write task connections on the current server to the maximum number of read-write task connections allowed by the file system. The weight of each evaluation index is determined, and a data node is then selected as the copy storage position according to a reference value calculated by the following formula:

$$\omega = \lambda_0\,\omega_{disk\_used} + \lambda_1\,\omega_{disk\_io} + \lambda_2\,\omega_{cpu} + \lambda_3\,\omega_{mem} + \lambda_4\,\omega_{process} + \lambda_5\,\omega_{fr}$$

In the third unit scheme, on the basis of the second unit scheme, the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are arranged in levels, the relations between the levels are analysed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be processing a different task, the corresponding matrix is matched adaptively so as to correct the copy placement position.

In the fifth unit scheme, on the basis of the third unit scheme, the node failure rate is the ratio of the fault time of the data node to its online running time, or the ratio of the used life of the data node to its design life.

The copy management system in the big data storage comprises the following four unit schemes:

In the first unit scheme, the system realizes the following functions: evaluation indexes for copy storage nodes are selected according to the real-time state information and the historical fault information of each data node server. The evaluation indexes comprise a disk utilization rate, a disk I/O load rate, a CPU load rate, a memory load rate, a read-write task connection rate and a node failure rate, where the read-write task connection rate is the ratio of the number of read-write task connections on the current server to the maximum number of read-write task connections allowed by the file system. The weight of each evaluation index is determined, and a data node is then selected as the copy storage position according to a reference value calculated by the following formula:

$$\omega = \lambda_0\,\omega_{disk\_used} + \lambda_1\,\omega_{disk\_io} + \lambda_2\,\omega_{cpu} + \lambda_3\,\omega_{mem} + \lambda_4\,\omega_{process} + \lambda_5\,\omega_{fr}$$

where $\omega$ is the reference value used to select a data node; $\omega_{disk\_used}$, $\omega_{disk\_io}$, $\omega_{cpu}$, $\omega_{mem}$, $\omega_{process}$ and $\omega_{fr}$ are respectively the disk utilization rate, the disk I/O load rate, the CPU load rate, the memory load rate, the read-write task connection rate and the node failure rate; $\lambda_0 + \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 = 1$; and $\lambda_0, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \omega_{disk\_used}, \omega_{disk\_io}, \omega_{cpu}, \omega_{mem}, \omega_{process}, \omega_{fr} \in [0, 1]$.

In the second unit scheme, on the basis of the first unit scheme, the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are arranged in levels, the relations between the levels are analysed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be processing a different task, the corresponding matrix is matched adaptively so as to correct the copy placement position.

Further, on the basis of the second unit scheme, the node failure rate is the ratio of the fault time of the data node to its online running time, or the ratio of the used life of the data node to its design life.

The invention has the beneficial effects that: the invention senses the real-time state information and the historical fault information of each data node server, provides more reliable data nodes for the management node to store the copy, effectively improves the writing efficiency and the load balancing degree during storage under the condition of not influencing the copy safety, and fundamentally solves the problem that the cluster needs load balancing after long-time operation.

In the method, server fault information is included as an evaluation index, and the predicted value of the probability of a data fault is used as a reference factor when storing data copies. When a data node actually fails, the copies to be completed are completed according to a set method, which avoids disorder in data storage and keeps the overhead of copy completion as low as possible. To some extent this replaces the rack-awareness function.

When a copy is lost through a fault and must be completed, completion is preferentially performed on an active node of the rack where the failed node is located. When that rack cannot work normally, active nodes with a similar failure rate are preferred. Because batch factors at server deployment are taken into account, the node that receives the completed copy is, as far as possible, from the same batch as the failed node, which further helps keep the copies stored at similar positions.

Drawings

FIG. 1 is a design reference model diagram of a copy storage node selection method in big data storage according to the present invention;

FIG. 2 is a flow chart of copy storage in big data storage according to the present invention;

FIG. 3 is a flow chart of copy completion in big data storage according to the present invention.

Detailed Description

The technical scheme of the invention is further explained in detail by combining the attached drawings.

Embodiment of method for selecting copy storage node in big data storage

The method of this embodiment comprises the following steps: sensing the real-time state information and the historical fault information of each data node server, and selecting evaluation indexes for copy storage nodes according to that information, the evaluation indexes comprising a disk utilization rate, a disk I/O load rate, a CPU load rate, a memory load rate, a read-write task connection rate and a node failure rate; determining the weight of each evaluation index; and then selecting a data node as the copy storage position according to a reference value calculated by the following formula:

$$\omega = \lambda_0\,\omega_{disk\_used} + \lambda_1\,\omega_{disk\_io} + \lambda_2\,\omega_{cpu} + \lambda_3\,\omega_{mem} + \lambda_4\,\omega_{process} + \lambda_5\,\omega_{fr}$$

where $\omega$ is the reference value used to select a data node; $\omega_{disk\_used}$, $\omega_{disk\_io}$, $\omega_{cpu}$, $\omega_{mem}$, $\omega_{process}$ and $\omega_{fr}$ are respectively the disk utilization rate, the disk I/O load rate, the CPU load rate, the memory load rate, the read-write task connection rate and the node failure rate; $\lambda_0 + \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 = 1$; and $\lambda_0, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \omega_{disk\_used}, \omega_{disk\_io}, \omega_{cpu}, \omega_{mem}, \omega_{process}, \omega_{fr} \in [0, 1]$.

Here $\omega_{disk\_used}$, $\omega_{disk\_io}$, $\omega_{cpu}$ and $\omega_{mem}$ can be obtained from the operating system of the corresponding cluster server. The read-write task connection rate $\omega_{process}$ is the ratio of the number of read-write task connections on the current server to the maximum number of read-write task connections allowed by the file system. The parameter $\omega_{fr}$ can be calculated as the ratio of the server's fault time to its online running time. In clusters where no faults have been recorded, the parameter can instead be defined as the ratio of used life to design life, so that, within one data center, its value depends on both the production time and the deployment time of the server.
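The two ways of obtaining the node failure rate can be sketched as follows. The function name `node_failure_rate`, its parameters and units, and the clamping of the result to [0, 1] are assumptions made for illustration; the text only requires that the index lie in [0, 1].

```python
from typing import Optional


def node_failure_rate(fault_hours: Optional[float],
                      online_hours: float,
                      used_years: float,
                      design_years: float) -> float:
    """Return the failure-rate index of one data node, in [0, 1].

    fault_hours is None when the cluster keeps no fault records.
    """
    if fault_hours is not None and online_hours > 0:
        # Ratio of fault time to online running time.
        rate = fault_hours / online_hours
    else:
        # No fault records: fall back to used life / design life.
        if design_years <= 0:
            raise ValueError("design life must be positive")
        rate = used_years / design_years
    # Assumption: clamp so that a server past its design life still maps into [0, 1].
    return min(max(rate, 0.0), 1.0)
```

For example, a server with 12 hours of recorded fault time over 1200 hours online yields 0.01, while a server without fault records that is 3 years into a 6-year design life yields 0.5.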

Once the weight of each evaluation index is determined, the copy placement strategy is essentially determined. The weights can be modified as required; after modification the scheme can be reduced to a method that places copies according to a single evaluation index, so as to suit special working conditions. The weights of the evaluation indexes can be described as a judgment matrix, and when the cluster is sensed to be processing different tasks, the corresponding matrix is matched adaptively so as to correct the copy placement position.

The weights of the evaluation indexes are represented by a matrix [A B C D E F], where A, B, C, D, E and F denote the six evaluation indexes. The evaluation indexes are arranged in levels and the relations between the levels are analysed quantitatively; for example, the lowercase ab denotes the relative relation between the two levels A and B. The weight of each evaluation index can be determined from the relative relations between the evaluation indexes in Table 1, and the normalized eigenvector finally obtained by calculation from Table 1 serves as the judgment matrix. If evaluation index A is to be considered first when the cluster is perceived to be processing a different task, index A is given the largest proportion, and the weights of the other five evaluation indexes are then set according to their relations to A. Once the weights are set, the copy placement position is obtained from the reference value calculated from each node's information; the design idea is shown in fig. 1.

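TABLE 1 Relative relationship matrix of the evaluation indexes

|   | A    | B    | C    | D    | E    | F  |
|---|------|------|------|------|------|----|
| A | 1    | ab   | ac   | ad   | ae   | af |
| B | 1/ab | 1    | bc   | bd   | be   | bf |
| C | 1/ac | 1/bc | 1    | cd   | ce   | cf |
| D | 1/ad | 1/bd | 1/cd | 1    | de   | df |
| E | 1/ae | 1/be | 1/ce | 1/de | 1    | ef |
| F | 1/af | 1/bf | 1/cf | 1/df | 1/ef | 1  |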
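One common way to turn a reciprocal matrix of this shape into weights is to take its principal eigenvector and normalize it to sum to 1. The sketch below does this by power iteration, which is a choice made here for illustration; the text says only that a normalized eigenvector is obtained. The helper names `judgment_matrix` and `ahp_weights` are hypothetical, and the mapping of rows A to F onto $\lambda_0$ to $\lambda_5$ is an assumption.

```python
from typing import Dict, List, Tuple


def judgment_matrix(n: int, upper: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Build an n x n reciprocal matrix from its upper-triangle entries (i < j).

    Entries that are not supplied stay at 1.0, i.e. equal importance.
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), ratio in upper.items():
        if not (0 <= i < j < n) or ratio <= 0:
            raise ValueError("need 0 <= i < j < n and a positive ratio")
        a[i][j] = ratio
        a[j][i] = 1.0 / ratio  # mirrored entry, e.g. 1/ab below the diagonal
    return a


def ahp_weights(a: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the principal eigenvector by power iteration, normalized to sum 1."""
    n = len(a)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]
    return w
```

With three indexes and ratios ab = 2, ac = 4, bc = 2, the matrix is fully consistent and the weights come out as 4/7, 2/7 and 1/7. The same two functions apply unchanged to the 6 x 6 matrix of Table 1.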

Embodiment of copy storage method in big data storage

The copy storage method of this embodiment stores the copies according to a three-copy scheme and ensures the reliability of the copies by placing them on two racks: two copies are stored on different nodes of the same rack, and the other copy is stored on a node of a different rack. When storage of the copies starts, if the client is a data node, the first copy is placed on that node; if the client is not a data node, a node is selected from the nodes on all racks, according to the selection method of the copy storage node in big data storage in the above embodiment, to hold the first copy. Then, among the other nodes of the same rack as the first copy, a node is selected by the same selection method to hold the second copy, and among the nodes on a rack different from the rack where the first and second copies are located, a node is selected by the same selection method to hold the third copy.

The specific storage process is shown in fig. 2. When the number of copies to be stored is greater than 0:

Step 1) judge whether the copy to be stored is the first copy; if so, go to step 2), otherwise go to step 3);

Step 2) judge whether the client is a data node; if so, go to step 4), and if not, go to step 5);

Step 3) judge whether the copy to be stored is the second copy; if so, go to step 6); otherwise the copy is the third copy, go to step 7);

Step 4) select that data node to hold the first copy;

Step 5) select a node from the nodes on all racks, according to the selection method of the copy storage node in big data storage in the above embodiment, to hold the first copy;

Step 6) select a node from the nodes on a rack different from that of the first copy, according to the selection method in the above embodiment, to hold the second copy;

Step 7) judge whether the first copy and the second copy are stored on the same rack; if so, go to step 8), and if not, go to step 9);

Step 8) select a node on a rack different from the rack where the first and second copies are located, according to the selection method in the above embodiment, to hold the third copy, and go to step 10);

Step 9) select a different node on the same rack as the second copy, according to the selection method in the above embodiment, to hold the third copy, and go to step 10);

Step 10) the placement of the three copies is finished and the process ends.
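The flow of fig. 2 can be sketched as below. All names are hypothetical: `rack_of` maps each data node to its rack, `score` stands for the reference value of the node selection method, and picking the node with the lowest score is an assumption. The third-copy choice is kept as its own function so that both branches of step 7) can be exercised.

```python
from typing import Callable, Dict, List, Sequence

Score = Callable[[str], float]


def _best(candidates: Sequence[str], score: Score) -> str:
    if not candidates:
        raise RuntimeError("no candidate node available")
    # Assumption: the node with the lowest reference value is preferred.
    return min(candidates, key=score)


def choose_first(rack_of: Dict[str, str], client: str, score: Score) -> str:
    # Steps 2), 4), 5): use the client if it is a data node, else search all racks.
    return client if client in rack_of else _best(list(rack_of), score)


def choose_second(rack_of: Dict[str, str], first: str, score: Score) -> str:
    # Step 6): a node on a rack different from that of the first copy.
    return _best([n for n in rack_of if rack_of[n] != rack_of[first]], score)


def choose_third(rack_of: Dict[str, str], first: str, second: str, score: Score) -> str:
    if rack_of[first] == rack_of[second]:  # step 7)
        # Step 8): first and second share a rack, so go to another rack.
        pool = [n for n in rack_of if rack_of[n] != rack_of[first]]
    else:
        # Step 9): a different node on the same rack as the second copy.
        pool = [n for n in rack_of if rack_of[n] == rack_of[second] and n != second]
    return _best(pool, score)


def place_three_copies(rack_of: Dict[str, str], client: str, score: Score) -> List[str]:
    first = choose_first(rack_of, client, score)
    second = choose_second(rack_of, first, score)
    third = choose_third(rack_of, first, second, score)
    return [first, second, third]  # step 10): all three copies placed
```

Because `choose_second` always leaves the first copy's rack, `place_three_copies` itself only reaches step 9); step 8) applies when the first two copies were placed on one rack by some other route.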

Embodiment of method for completing duplicate copy in big data storage

The completion method of this embodiment is as follows: when the number of copies to be completed is less than 3, the rack where each lost copy was located is obtained, and it is judged whether an active node exists on the rack where each failed node is located. If so, a data node is selected from those active nodes according to a set node selection method to complete the copy; if not, a node is selected, according to the set node selection method, from the nodes with the same failure rate as the failed node to complete the copy.

As shown in fig. 3, when the number of copies to be completed is greater than 0, it is determined whether that number is equal to 3; if so, go to step 4), and if not, go to step 1);

Step 1) obtain the rack where each lost copy was located and judge whether an active node exists on the rack where each failed node is located; if so, go to step 2), and if not, go to step 3);

Step 2) enumerate the active nodes of the rack where the failed node is located, and determine a suitable node according to the set node selection method to complete the copy;

Step 3) enumerate the nodes with the same failure rate as the failed node, and determine a suitable node according to the set node selection method to complete the copy;

Step 4) generate a bad block and raise an alarm.

The node selection method set in the above step 2) and step 3) may adopt the above selection method of the copy storage node in the big data storage, and may also adopt other node selection methods in the prior art, which are not described in detail here.
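The completion flow of fig. 3 can be sketched as follows. The names are hypothetical, and several points are assumptions rather than statements of the text: "the same failure rate" is implemented as a difference within `tolerance`, the node with the lowest `score` is chosen, and nodes listed in `holders` (those already holding a copy) are skipped.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence


class BadBlockError(Exception):
    """All three copies are lost, so there is nothing to copy from (step 4)."""


def complete_copies(failed_nodes: Sequence[str],
                    rack_of: Dict[str, str],
                    active: Sequence[str],
                    failure_rate: Dict[str, float],
                    score: Callable[[str], float],
                    holders: FrozenSet[str] = frozenset(),
                    tolerance: float = 0.05) -> List[str]:
    """Return one target node per lost copy, in the order of failed_nodes."""
    if len(failed_nodes) >= 3:
        # Equal to 3 in the three-copy setting of the text.
        raise BadBlockError("no surviving copy: generate a bad block and alarm")
    chosen: List[str] = []
    for failed in failed_nodes:
        taken = set(holders) | set(chosen)
        free = [n for n in active if n not in taken]
        # Steps 1) and 2): prefer active nodes on the failed node's rack.
        pool = [n for n in free if rack_of[n] == rack_of[failed]]
        if not pool:
            # Step 3): otherwise use nodes with a similar failure rate (assumed tolerance).
            pool = [n for n in free
                    if abs(failure_rate[n] - failure_rate[failed]) <= tolerance]
        if not pool:
            raise RuntimeError("no suitable node for " + failed)
        chosen.append(min(pool, key=score))  # assumption: lowest score wins
    return chosen
```

A caller would pass the reference value of the node selection method as `score`, or any other selection rule, since the text allows other node selection methods here.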

The embodiment of the copy management system in big data storage of the invention

The copy management system in the big data storage of the embodiment is a management platform capable of realizing a method for selecting copy storage nodes in the big data storage. The method for selecting the copy storage node in the big data storage can be referred to the above embodiments, and is not described in detail here.

The present invention has been described in relation to particular embodiments, but it is not limited to the described embodiments. Within the idea given by the present invention, a technical scheme formed by fine-tuning the above embodiments still falls within the protection scope of the present invention where the technical means of the embodiments are changed, replaced or modified in a manner readily conceivable to those skilled in the art, the functions achieved are basically the same as those of the corresponding technical means of the invention, and the purpose achieved is basically the same.

Claims

1. The method for completing the copies in the big data storage is characterized in that when the number of the copies needing to be completed is less than 3, a rack where each lost copy is located is obtained, whether an active node exists on the rack where each fault node is located is judged, if the active node exists, a data node is selected from the active nodes according to a set node selection method to be used for completing the copies, and if the active node does not exist, a node is selected from the nodes with the same fault rate as the fault node according to the set node selection method to be used for completing the copies;

the set node selection method comprises the following steps: selecting evaluation indexes of the copy storage nodes according to the real-time state information and the historical fault information of each data node server, wherein the evaluation indexes comprise a disk utilization rate, a disk I/O load rate, a CPU load rate, a memory load rate, a read-write task connection rate and a node failure rate, and the read-write task connection rate is the ratio of the number of read-write task connections on the current server to the maximum number of read-write task connections allowed by the file system; determining the weight of each evaluation index, and then selecting the data node as a copy storage position according to a reference value calculated by the following formula:

$$\omega = \lambda_0\,\omega_{disk\_used} + \lambda_1\,\omega_{disk\_io} + \lambda_2\,\omega_{cpu} + \lambda_3\,\omega_{mem} + \lambda_4\,\omega_{process} + \lambda_5\,\omega_{fr}$$

2. The method for completing the copies in big data storage according to claim 1, wherein an analytic hierarchy process is used to determine the weight of each evaluation index, the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are arranged in levels, the relations between the levels are analysed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be processing different tasks, the corresponding matrix is matched adaptively so as to correct the copy placement position.

3. The method for completing the copies in big data storage according to claim 1 or 2, wherein the node failure rate is the ratio of the fault time of the data node to its online running time, or the ratio of the used life of the data node to its design life.