CN110535898A

CN110535898A - Replica storage, completion and node selection methods and management system in big data storage

Info

Publication number: CN110535898A
Application number: CN201810545954.9A
Authority: CN
Inventors: 丁博; 徐大青; 张展国; 贺彪; 杨迎春; 王少鹏; 刘一擎; 丁亮
Original assignee: State Grid Corp of China SGCC; Xuji Group Co Ltd; Xuchang XJ Software Technology Co Ltd
Current assignee: State Grid Corp of China SGCC; Xuji Group Co Ltd; Xuchang XJ Software Technology Co Ltd
Priority date: 2018-05-25
Filing date: 2018-05-25
Publication date: 2019-12-03
Anticipated expiration: 2038-05-25
Also published as: CN110535898B

Abstract

The invention relates to replica storage, replica completion and node selection methods, and a management system, for big data storage. In the node selection method, evaluation indices for replica storage nodes are chosen from the real-time status information and historical fault information of each data node server, and the indices include a predicted value of the probability of failure. A weight is determined for each evaluation index, and the data node used for replica storage is obtained from the weighted calculation. Using this node selection method, suitable nodes are selected for replica storage under the three-replica scheme. When a replica fails and must be completed, completion is first attempted on an active node in the rack where the faulty node is located; when that rack cannot work normally, an active node with a similar failure rate is selected instead. The invention improves write efficiency and load balance during storage without affecting replica safety, and fundamentally solves the problem that a cluster needs load balancing after running for a long time.

Description

Replica storage, completion and node selection methods and management system in big data storage

Technical field

The invention belongs to the technical field of big data storage and cloud computing, and in particular relates to methods for replica storage, replica completion and node selection in big data storage, and to a management system.

Background

Existing big data storage systems are generally distributed (for example HDFS), so node failure and hardware faults must be taken into account; replica technology ensures the reliability and availability of the system. Placing and selecting data replicas appropriately also makes data access more efficient. Big data storage commonly adopts a three-replica strategy, which keeps data safe and effectively supports distributed computing. However, when data locality is considered, an unreasonable replica distribution affects computations with high data-locality requirements: tasks are assigned to machines that hold a replica but have lower performance, which degrades the performance of the whole cluster.

The load balancing mentioned in commonly used replica storage strategies mainly works from the perspective of data volume: once the load has become unbalanced, the existing data is rebalanced, which in essence means moving replicas. Load balancing is therefore only a remedy for an unreasonable replica storage strategy. Ideally, the location of a replica should be chosen or adjusted autonomously, according to the current performance of the cluster, at the time the replica is placed.

Existing replica storage strategies also include methods that place replicas according to rack, disk usage or load, but each method considers only a narrow set of factors and still cannot address load balancing and storage efficiency at the same time. The problem is more pronounced in heterogeneous clusters, where some low-performance machines may be overloaded while some high-performance machines sit idle. Replica copying and migration in a big data storage cluster are generally caused by server hardware faults. Servers are typically designed for a service life of 5 to 7 years, and the actual service life of a single server depends on its batch, usage intensity and operating environment. Current replica storage methods do not take server failure or aging into account. For example, the reference "Improved HDFS Replica Placement Strategy Based on Support Vector Machine" (Luo Jun et al., Computer Engineering, Vol. 41, No. 11, November 2015) considers only five factors: relative load rate, network distance, disk performance, CPU performance and memory. When a server fails, data replication and migration are random, which leaves replica placement disordered.

From the perspective of operations research, the replica storage strategy for big data can be regarded as a decision problem, and one that is difficult to analyze quantitatively. For such problems there is the analytic hierarchy process (AHP), which compares the evaluation indices involved one against another and quantifies the relative importance of the resulting groups of relationships. The best result is then obtained by weighting the influencing factors; in theory, an ideal result follows as long as suitable weights can be given.

Summary of the invention

The invention provides replica storage, replica completion and node selection methods and a management system for big data storage, to solve the problem that existing cluster replica storage strategies do not effectively sense the state of each data node server, rely on a narrow set of factors when selecting nodes, and cannot address load balancing and storage efficiency at the same time.

To solve the above technical problem, the method of the invention for selecting a replica storage node in big data storage includes the following four unit schemes:

Unit scheme 1: evaluation indices for replica storage nodes are chosen according to the real-time status information and historical fault information of each data node server. The indices are disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections currently on the server to the maximum number of read/write task connections allowed by the file system. A weight is determined for each evaluation index, and a data node is then selected as the replica storage location according to the reference value calculated by the following formula:

ω=λ₀ω_{disk_used}+λ₁ω_{disk_io}+λ₂ω_cpu+λ₃ω_mem+λ₄ω_process+λ₅ω_fr

where ω is the data node selection reference value; ω_{disk_used}, ω_{disk_io}, ω_cpu, ω_mem, ω_process and ω_fr are the disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, respectively; λ₀+λ₁+λ₂+λ₃+λ₄+λ₅=1; and λ₀, λ₁, λ₂, λ₃, λ₄, λ₅, ω_{disk_used}, ω_{disk_io}, ω_cpu, ω_mem, ω_process, ω_fr ∈ [0,1].
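The weighted sum above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the dictionary keys, the function names and the sample values are hypothetical, and the rule that the node with the smallest ω is preferred is an assumption (all six indices measure load or failure, but the text does not state the selection direction).

```python
# Indicator keys follow the order of the formula (names are illustrative).
INDICATORS = ("disk_used", "disk_io", "cpu", "mem", "process", "fr")


def reference_value(metrics, weights):
    """Compute ω as the weighted sum of the six indicators for one data node."""
    if abs(sum(weights[k] for k in INDICATORS) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for k in INDICATORS:
        if not (0.0 <= metrics[k] <= 1.0 and 0.0 <= weights[k] <= 1.0):
            raise ValueError(f"{k}: metric and weight must lie in [0, 1]")
    return sum(weights[k] * metrics[k] for k in INDICATORS)


def select_node(nodes, weights):
    """Return the node with the smallest ω (assumption: lower means less loaded)."""
    return min(nodes, key=lambda name: reference_value(nodes[name], weights))


# Hypothetical sample data for two data nodes.
weights = {"disk_used": 0.2, "disk_io": 0.2, "cpu": 0.2,
           "mem": 0.1, "process": 0.1, "fr": 0.2}
nodes = {
    "dn1": {"disk_used": 0.8, "disk_io": 0.5, "cpu": 0.6,
            "mem": 0.7, "process": 0.4, "fr": 0.1},
    "dn2": {"disk_used": 0.3, "disk_io": 0.2, "cpu": 0.4,
            "mem": 0.5, "process": 0.2, "fr": 0.3},
}
best = select_node(nodes, weights)  # "dn2": ω is about 0.31 versus 0.51 for dn1
```

Validating the constraints from the formula (weights summing to 1, every term in [0,1]) before scoring keeps a misconfigured weight set from silently skewing the choice.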

Unit scheme 2: on the basis of unit scheme 1, the analytic hierarchy process is used to determine the weight of each evaluation index. The weights are described as a judgment matrix, the evaluation indices are arranged in layers, the relationships between layers are analyzed quantitatively, and a normalized eigenvector is finally obtained as the judgment matrix. When the cluster is sensed to be processing a different kind of task, the corresponding matrix is matched adaptively so as to correct the replica placement position.

Unit scheme 3: on the basis of unit scheme 1, the node failure rate is the ratio of the data node's fault time to its online running time, or the ratio of the data node's years in service to its design service life.

Unit scheme 4: on the basis of unit scheme 2, the node failure rate is the ratio of the data node's fault time to its online running time, or the ratio of the data node's years in service to its design service life.

The replica storage method of the invention for big data storage includes the following four unit schemes:

Unit scheme 1: the default number of replicas in this method is 3; two replicas are stored on different nodes in the same rack and the third is stored on a node in a different rack. When replica storage begins, if the client is a data node, the first replica is placed on that node; if the client is not a data node, a node for the first replica is selected from the nodes on all racks according to the above method for selecting a replica storage node in big data storage. A node for the second replica is then selected, by the same selection method, from the nodes in a rack different from that of the first replica, and a node for the third replica is selected, by the same selection method, from the other nodes in the same rack as the second replica.

The replica completion method of the invention for big data storage includes the following five unit schemes:

Unit scheme 1: when the number of replicas to be completed is less than 3, the rack of each lost replica is obtained, and it is determined whether active nodes exist on the rack of each faulty node. If active nodes exist, a data node is selected from among them according to a set node selection method and used to complete the replica; if no active node exists, a node is selected, according to the set node selection method, from the nodes whose failure rate is the same as that of the faulty node and used to complete the replica.

Unit scheme 2: on the basis of unit scheme 1, the set node selection method is as follows: evaluation indices for replica storage nodes are chosen according to the real-time status information and historical fault information of each data node server. The indices are disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections currently on the server to the maximum number of read/write task connections allowed by the file system. A weight is determined for each evaluation index, and a data node is then selected as the replica storage location according to the reference value calculated by the following formula:

ω=λ₀ω_{disk_used}+λ₁ω_{disk_io}+λ₂ω_cpu+λ₃ω_mem+λ₄ω_process+λ₅ω_fr

Unit scheme 3: on the basis of unit scheme 2, the analytic hierarchy process is used to determine the weight of each evaluation index. The weights are described as a judgment matrix, the evaluation indices are arranged in layers, the relationships between layers are analyzed quantitatively, and a normalized eigenvector is finally obtained as the judgment matrix. When the cluster is sensed to be processing a different kind of task, the corresponding matrix is matched adaptively so as to correct the replica placement position.

Unit scheme 5: on the basis of unit scheme 3, the node failure rate is the ratio of the data node's fault time to its online running time, or the ratio of the data node's years in service to its design service life.

The replica management system of the invention for big data storage includes the following four unit schemes:

Unit scheme 1: the system implements the following functions: evaluation indices for replica storage nodes are chosen according to the real-time status information and historical fault information of each data node server. The indices are disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections currently on the server to the maximum number of read/write task connections allowed by the file system. A weight is determined for each evaluation index, and a data node is then selected as the replica storage location according to the reference value calculated by the following formula:

ω=λ₀ω_{disk_used}+λ₁ω_{disk_io}+λ₂ω_cpu+λ₃ω_mem+λ₄ω_process+λ₅ω_fr

The beneficial effects of the invention are as follows. The invention senses the real-time status information and historical fault information of each data node server and gives the management node more reliable data nodes for replica storage. Without affecting replica safety, it improves write efficiency and load balance during storage, and fundamentally solves the problem that a cluster needs load balancing after running for a long time.

The invention includes server fault information among the evaluation indices and uses the predicted probability of failure as a reference factor for placing data replicas. When a data node actually fails, the replicas that need completion are completed according to a set method, which avoids disordered data placement and keeps the overhead of replica completion as low as possible. To a certain extent this replaces the function of rack awareness.

When a replica fails and must be completed, completion is preferentially performed on an active node in the rack where the faulty node is located. When that rack cannot work normally, an active node with a similar failure rate is preferred. Given that servers are deployed in batches, this makes it as likely as possible that the node receiving the completed replica comes from the same batch as the faulty node, which in turn keeps the replicas stored in nearby locations.

Brief description of the drawings

Fig. 1 is a design reference model diagram of the method of the invention for selecting a replica storage node in big data storage;

Fig. 2 is a flow chart of replica storage in big data storage according to the invention;

Fig. 3 is a flow chart of replica completion in big data storage according to the invention.

Detailed description of the embodiments

The technical solutions of the invention are described in further detail below with reference to the accompanying drawings.

Embodiment of the method for selecting a replica storage node in big data storage

The method of this embodiment is as follows: the real-time status information and historical fault information of each data node server are sensed, and evaluation indices for replica storage nodes are chosen from them, namely disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate. A weight is determined for each evaluation index, and a data node is then selected as the replica storage location according to the reference value calculated by the following formula:

ω=λ₀ω_{disk_used}+λ₁ω_{disk_io}+λ₂ω_cpu+λ₃ω_mem+λ₄ω_process+λ₅ω_fr

Here ω_{disk_used}, ω_{disk_io}, ω_cpu and ω_mem can be obtained from the operating system of each cluster server; the read/write task connection rate ω_process is the ratio of the number of read/write task connections currently on the server to the maximum number of read/write task connections allowed by the file system; and ω_fr can be calculated from the ratio of the server's fault time to its online running time. In clusters that keep no fault records, this parameter can instead be defined as the ratio of years in service to design service life, so within one data center the model for this parameter depends on both the production time and the deployment time of the servers.
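The two ratio-based indicators described above can be sketched in Python as follows. This is only an illustration: the function names, the units (hours and years) and the clamping of results to [0, 1] are assumptions rather than details given in the text.

```python
def connection_rate(current_connections, max_connections):
    """ω_process: current read/write connections over the allowed maximum."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return min(max(current_connections / max_connections, 0.0), 1.0)


def failure_rate(fault_hours=None, online_hours=None,
                 years_used=None, design_years=None):
    """ω_fr: fault time / online time, or years in service / design life.

    The second form is the fallback for clusters that keep no fault records.
    """
    if fault_hours is not None and online_hours:
        ratio = fault_hours / online_hours
    elif years_used is not None and design_years:
        ratio = years_used / design_years
    else:
        raise ValueError("need fault/online hours or used/design years")
    return min(max(ratio, 0.0), 1.0)  # clamp (assumption) so that ω_fr stays in [0, 1]


# Hypothetical values: 50 of 200 connections in use, 3 years into a 6-year design life.
w_process = connection_rate(50, 200)                # 0.25
w_fr = failure_rate(years_used=3, design_years=6)   # 0.5
```

Clamping is one simple way to keep a server that has outlived its design life inside the [0, 1] range required by the formula; other normalizations are possible.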

Once the weights of the evaluation indices are determined, the replica placement strategy is essentially determined. The weights can be modified as needed; after modification, the scheme can be reduced to placing replicas according to a single evaluation index, to suit special working conditions. The weights can be described as a judgment matrix, and when the cluster is sensed to be processing a different kind of task, the corresponding matrix is matched adaptively so as to correct the replica position.
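One possible way to picture the adaptive matching of weights is a lookup of predefined weight sets, sketched below in Python. The workload names, the weight values and the fallback rule are all hypothetical; the text does not specify how task types are detected or which weights belong to them.

```python
# Order of each tuple: (disk_used, disk_io, cpu, mem, process, fr).
# Workload names and values are illustrative assumptions only.
WEIGHT_PROFILES = {
    "balanced":    (1 / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 6),
    "write_heavy": (0.15, 0.40, 0.10, 0.10, 0.15, 0.10),
    # A single non-zero weight reduces the scheme to placement by one index.
    "disk_only":   (1.0, 0.0, 0.0, 0.0, 0.0, 0.0),
}


def weights_for(workload):
    """Return the weight set for a sensed workload, defaulting to 'balanced'."""
    weights = WEIGHT_PROFILES.get(workload, WEIGHT_PROFILES["balanced"])
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return weights
```

The "disk_only" entry shows the reduction to a single evaluation index mentioned above: with one weight set to 1 and the rest to 0, the reference value equals that one indicator.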

The weights of the evaluation indices are represented by the matrix [A B C D E F], where A, B, C, D, E and F denote the six evaluation indices above. The indices are arranged in layers and the relationships between layers are analyzed quantitatively; for example, the lowercase pair ab denotes the relative relationship between layers A and B. The weight of each index can be determined from the relative relationships between the indices in Table 1, and a normalized eigenvector, computed from Table 1, is finally obtained as the judgment matrix. For example, if evaluation index A is to be given priority when the cluster is sensed to be processing a particular kind of task, then A takes the largest share, and the weights of the other five indices are set according to their relationships with A. Once the weights are set, the replica placement position is obtained from the reference value calculated from each node's information. The design idea is shown in Fig. 1.

Table 1 Relative relationship matrix of the evaluation indices

      A      B      C      D      E      F
A     1      ab     ac     ad     ae     af
B     1/ab   1      bc     bd     be     bf
C     1/ac   1/bc   1      cd     ce     cf
D     1/ad   1/bd   1/cd   1      de     df
E     1/ae   1/be   1/ce   1/de   1      ef
F     1/af   1/bf   1/cf   1/df   1/ef   1
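As a hedged illustration of how weights might be derived from Table 1, the sketch below fills a reciprocal matrix from its upper-triangle entries and approximates its normalized principal eigenvector by power iteration, a standard way of evaluating an AHP comparison matrix. The function names, the choice of power iteration and the sample ratios are assumptions; the text does not say how the eigenvector is computed.

```python
LABELS = "ABCDEF"


def reciprocal_matrix(upper):
    """Build the 6x6 matrix of Table 1 from entries such as {("A", "B"): ab}.

    Pairs that are not listed default to 1 (equal importance).
    """
    m = [[1.0] * 6 for _ in range(6)]
    for (row, col), value in upper.items():
        i, j = LABELS.index(row), LABELS.index(col)
        m[i][j] = float(value)
        m[j][i] = 1.0 / value  # lower triangle holds the reciprocals
    return m


def ahp_weights(matrix, iterations=100):
    """Approximate the normalized principal eigenvector by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]
    return w


# Illustrative ratios: A is twice as important as B and four times as important
# as C..F; B is twice as important as C..F; C..F are equally important.
upper = {("A", "B"): 2, ("A", "C"): 4, ("A", "D"): 4, ("A", "E"): 4, ("A", "F"): 4,
         ("B", "C"): 2, ("B", "D"): 2, ("B", "E"): 2, ("B", "F"): 2}
weights = ahp_weights(reciprocal_matrix(upper))
# weights is approximately [0.4, 0.2, 0.1, 0.1, 0.1, 0.1], which sums to 1
```

The resulting vector can be used directly as λ₀ to λ₅, since it is non-negative and normalized to sum to 1.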

Embodiment of the replica storage method in big data storage

The replica storage method of this embodiment follows the three-replica scheme and ensures replica reliability by placing the replicas on two racks: two replicas are stored on different nodes in the same rack and the third on a node in a different rack. When replica storage begins, if the client is a data node, the first replica is placed on that node; if the client is not a data node, a node for the first replica is selected from the nodes on all racks according to the replica storage node selection method of the above embodiment. A node for the second replica is then selected, by the same method, from the other nodes in the same rack as the first replica, and a node for the third replica is selected, by the same method, from the nodes on a rack different from the rack holding the first and second replicas.

The specific storage flow is shown in Fig. 2. When the number of replicas to be stored is greater than 0:

Step 1) Determine whether the replica to be stored is the first replica; if so, go to step 2), otherwise go to step 3);

Step 2) Determine whether the client is a data node; if so, go to step 4), otherwise go to step 5);

Step 3) Determine whether the replica to be stored is the second replica; if so, go to step 6); otherwise it is the third replica, go to step 7);

Step 4) Select that data node for placing the first replica;

Step 5) From the nodes on all racks, select a node for placing the first replica according to the replica storage node selection method of the above embodiment;

Step 6) From the nodes in a rack different from that of the first replica, select a node for placing the second replica according to the replica storage node selection method of the above embodiment;

Step 7) Determine whether the first and second replicas are stored on the same rack; if so, go to step 8), otherwise go to step 9);

Step 8) From the nodes on a rack different from the rack holding the first and second replicas, select a node for placing the third replica according to the replica storage node selection method of the above embodiment, then go to step 10);

Step 9) From the other nodes in the same rack as the second replica, select a node for placing the third replica according to the replica storage node selection method of the above embodiment, then go to step 10);

Step 10) Placement of the three replicas is complete; end the flow.
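Steps 1) to 10) above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `rack_of` and `score` are hypothetical dictionaries, `score` is assumed to hold precomputed reference values ω, and the node with the lowest ω is assumed to be preferred. With step 6) as written, this sketch always reaches step 9); the step 8) branch is kept only to mirror the flow chart.

```python
def pick(candidates, score):
    """Return the candidate with the lowest reference value ω (assumed: lower is better)."""
    if not candidates:
        raise ValueError("no candidate node available")
    return min(candidates, key=lambda n: score[n])


def place_three_replicas(rack_of, score, client=None):
    """rack_of maps node -> rack, score maps node -> ω, client is the writing host."""
    nodes = list(rack_of)
    # Steps 2), 4), 5): use the client if it is a data node, else the best node overall.
    first = client if client in rack_of else pick(nodes, score)
    # Step 6): second replica on a node in a rack different from the first replica's.
    second = pick([n for n in nodes if rack_of[n] != rack_of[first]], score)
    # Step 7): check whether the first two replicas share a rack.
    if rack_of[first] == rack_of[second]:
        # Step 8): third replica on a rack other than the one holding both.
        pool = [n for n in nodes if rack_of[n] != rack_of[first]]
    else:
        # Step 9): third replica in the second replica's rack, on another node.
        pool = [n for n in nodes if rack_of[n] == rack_of[second] and n != second]
    third = pick(pool, score)
    return first, second, third  # Step 10): placement complete


# Hypothetical cluster: two racks with two data nodes each.
rack_of = {"a1": "r1", "a2": "r1", "b1": "r2", "b2": "r2"}
score = {"a1": 0.5, "a2": 0.2, "b1": 0.3, "b2": 0.4}
placement = place_three_replicas(rack_of, score)  # ("a2", "b1", "b2")
```

In a real cluster the scores would be recomputed between picks, because writing a replica changes the load on the chosen node.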

Embodiment of the replica completion method in big data storage

The completion method of this embodiment is as follows: when the number of replicas to be completed is less than 3, the rack of each lost replica is obtained, and it is determined whether active nodes exist on the rack of each faulty node. If active nodes exist, a data node is selected from among them according to a set node selection method and used to complete the replica; if no active node exists, a node is selected, according to the set node selection method, from the nodes whose failure rate is the same as that of the faulty node and used to complete the replica.

The specific completion flow is shown in Fig. 3. When the number of replicas to be completed is greater than 0, determine whether it equals 3; if it equals 3, go to step 4); if it is less than 3, go to step 1);

Step 1) Obtain the rack of each lost replica and determine whether active nodes exist on the rack of each faulty node; if so, go to step 2), otherwise go to step 3);

Step 2) Enumerate the active nodes on the rack of the faulty node, and use the set node selection method to compute a suitable node for replica completion;

Step 3) Enumerate the nodes whose failure rate is the same as that of the faulty node, and use the set node selection method to compute a suitable node for replica completion;

Step 4) A bad block has occurred; raise an alarm.

The set node selection method in steps 2) and 3) may be the replica storage node selection method described above, or another node selection method from the prior art, which is not described in detail here.
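The completion flow of Fig. 3 can be sketched in Python as follows. The data structures and names are hypothetical, and several points are assumptions: "same failure rate" is interpreted as "within a small tolerance", nodes that already hold a replica are excluded, the lowest precomputed score ω wins, and the alarm of step 4) is modeled as an exception.

```python
def complete_replicas(lost, rack_of, failure_rate, active, score,
                      holders=(), tolerance=0.05):
    """Map each failed node in `lost` to a target node for the re-created replica."""
    if len(lost) >= 3:
        # Step 4): all three replicas are gone, so the block is bad and is only reported.
        raise RuntimeError("bad block: all replicas lost")
    used = set(holders)  # nodes that already hold a replica of this block
    plan = {}
    for failed in lost:
        free = [n for n in active if n not in used]
        # Steps 1) and 2): prefer active nodes on the failed node's rack.
        pool = [n for n in free if rack_of[n] == rack_of[failed]]
        if not pool:
            # Step 3): otherwise use nodes whose failure rate matches the failed node's.
            pool = [n for n in free
                    if abs(failure_rate[n] - failure_rate[failed]) <= tolerance]
        if not pool:
            raise RuntimeError(f"no candidate node for {failed}")
        plan[failed] = min(pool, key=lambda n: score[n])
        used.add(plan[failed])  # do not reuse the same target for another replica
    return plan


# Hypothetical cluster: f1 and f2 have failed; rack r3 has no active node left.
rack_of = {"f1": "r1", "a1": "r1", "a2": "r1", "b1": "r2", "f2": "r3", "c1": "r4"}
failure_rate = {"f1": 0.10, "a1": 0.10, "a2": 0.12, "b1": 0.30, "f2": 0.31, "c1": 0.90}
score = {"a1": 0.6, "a2": 0.4, "b1": 0.5, "c1": 0.1}
active = ["a1", "a2", "b1", "c1"]

same_rack = complete_replicas(["f1"], rack_of, failure_rate, active, score)
similar_rate = complete_replicas(["f2"], rack_of, failure_rate, active, score)
# same_rack == {"f1": "a2"}; similar_rate == {"f2": "b1"}
```

In the second call, node c1 has the best score but is skipped because its failure rate is far from that of f2, which reflects the stated aim of keeping the completed replica on a server of the same batch.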

Embodiment of the replica management system in big data storage

The replica management system of this embodiment is a management platform capable of implementing the method for selecting a replica storage node in big data storage. For that method, see the above embodiment; it is not described again here.

Specific embodiments of the invention are given above, but the invention is not limited to them. Where, under the ideas given by the invention, the technical means of the above embodiments are transformed, replaced or modified in ways that readily occur to a person skilled in the art, and the resulting means perform substantially the same function and achieve substantially the same purpose as the corresponding means of the invention, the technical solution so formed is merely a fine adjustment of the above embodiments and still falls within the scope of protection of the invention.

Claims

1. A method for selecting a replica storage node in big data storage, characterized in that the method is as follows: evaluation indices for replica storage nodes are chosen according to the real-time status information and historical fault information of each data node server, the indices including disk usage, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections currently on the server to the maximum number of read/write task connections allowed by the file system; a weight is determined for each evaluation index, and a data node is then selected as the replica storage location according to the reference value calculated by the following formula:

ω=λ₀ω_{disk_used}+λ₁ω_{disk_io}+λ₂ω_cpu+λ₃ω_mem+λ₄ω_process+λ₅ω_fr

Wherein, ω is that back end selects reference value, ω_{disk_used}、ω_{disk_io}、ω_cpu、ω_mem、ω_process、ω_frRespectively magnetic Disk utilization rate, magnetic disc i/o load factor, cpu load rate, memory load factor, read-write task bonding ratio and node failure rate, λ₀+λ₁+ λ₂+λ₃+λ₄+λ₅=1, λ₀、λ₁、λ₂、λ₃、λ₄、λ₅、ω_{disk_used}、ω_{disk_io}、ω_cpu、ω_mem、ω_process、ω_fr∈[0,1]。

2. the selection method of copy storage node in big data storage according to claim 1, which is characterized in that use layer Fractional analysis determines the weight of each evaluation index, and the weight of each evaluation index is described as judgment matrix, refers to each evaluation Mark is layered, and quantitative analysis is realized in the connection between each layer, finally seeks a normalization characteristic vector as judgment matrix； When perceiving cluster when handling different task, corresponding matrix is adaptively matched, to correct suitable Replica placement position.

3. the selection method of copy storage node in big data storage according to claim 1 or 2, which is characterized in that institute State the ratio or back end years already spent and design that node failure rate is back end fault time and on-line operation time The ratio of service life.

4. using copy is deposited in the big data storage of the selection method of copy storage node in the storage of big data described in claim 1 Put method, default number of copies is 3 in this method, and two of them copy is stored on the different nodes in same rack, in addition one On a node for being stored in different racks, which is characterized in that when copy starts storage, if client is back end, by first On a Replica placement node, if client is not back end, deposited in the node on institute's organic frame according to the big data The selection method selection node of copy storage node is for placing the first authentic copy in storage；Then with first copy difference rack Node according to the big data storage in copy storage node selection method selection node for place second copy, It is selected from triplicate same machine frame and different nodes according to the selection method of copy storage node in big data storage Node is selected for placing triplicate.

5. The replica storage method in big data storage according to claim 4, characterized in that the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are divided into layers, the relations between the layers are analyzed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be handling different tasks, the corresponding matrix is adaptively matched so as to correct to a suitable replica placement position.

6. A replica completion method in big data storage, characterized in that, when the number of replicas to be completed is less than 3, the rack on which each lost replica was located is obtained, and it is determined whether an active node exists on the rack of each faulty node; if active nodes exist, a data node is selected from among these active nodes according to a set node selection method and used for replica completion; if no active node exists, a node is selected, according to the set node selection method, from the nodes having the same failure rate as the faulty node and used for replica completion.
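The two-stage fallback in this claim can be sketched in Python as below. The function name `choose_completion_node`, the data structures and the `tolerance` parameter are hypothetical; with the default `tolerance=0.0` the fallback keeps only nodes whose failure rate equals that of the faulty node, as the claim states, while a small positive tolerance is an assumed relaxation toward "similar" failure rates.

```python
def choose_completion_node(failed, cluster, active, failure_rate, score,
                           tolerance=0.0):
    """Pick a node to re-create a replica lost on `failed`.

    cluster maps rack -> node names, active is the set of live nodes,
    failure_rate maps node -> rate, score maps node -> reference value
    (assumption: lower is better). Returns None if no candidate exists.
    """
    rack = next(r for r, nodes in cluster.items() if failed in nodes)
    # Stage 1: active nodes on the rack of the faulty node.
    candidates = [n for n in cluster[rack] if n in active and n != failed]
    if not candidates:
        # Stage 2: active nodes elsewhere with a matching failure rate.
        candidates = [
            n for n in active
            if n != failed
            and abs(failure_rate[n] - failure_rate[failed]) <= tolerance
        ]
    return min(candidates, key=score) if candidates else None


cluster = {"rack1": ["a", "b"], "rack2": ["c", "d"]}
rates = {"a": 0.1, "b": 0.2, "c": 0.1, "d": 0.3}
scores = {"a": 0.4, "b": 0.2, "c": 0.3, "d": 0.1}
same_rack = choose_completion_node("a", cluster, {"b", "c", "d"}, rates, scores.get)
fallback = choose_completion_node("a", cluster, {"c", "d"}, rates, scores.get)
```

In the second call the whole of `rack1` is down, so the sketch falls back to `c`, the only active node whose failure rate matches that of the faulty node `a`.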

7. The replica completion method in big data storage according to claim 6, characterized in that the set node selection method is: selecting the evaluation indexes of the replica storage node according to the real-time status information and historical fault information of each data node server, the evaluation indexes including the disk usage rate, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections on the current server to the maximum number of read/write task connections allowed by the file system; and determining the weight of each evaluation index, then selecting the data node to serve as the replica storage location according to the reference value calculated by the following formula:

8. The replica completion method in big data storage according to claim 7, characterized in that the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are divided into layers, the relations between the layers are analyzed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be handling different tasks, the corresponding matrix is adaptively matched so as to correct to a suitable replica placement position.

9. A replica management system in big data storage, characterized in that the system is capable of implementing the following functions: selecting the evaluation indexes of the replica storage node according to the real-time status information and historical fault information of each data node server, the evaluation indexes including the disk usage rate, disk I/O load rate, CPU load rate, memory load rate, read/write task connection rate and node failure rate, where the read/write task connection rate is the ratio of the number of read/write task connections on the current server to the maximum number of read/write task connections allowed by the file system; and determining the weight of each evaluation index, then selecting the data node to serve as the replica storage location according to the reference value calculated by the following formula:

10. The replica management system in big data storage according to claim 9, characterized in that the analytic hierarchy process is used to determine the weight of each evaluation index: the weights of the evaluation indexes are described as a judgment matrix, the evaluation indexes are divided into layers, the relations between the layers are analyzed quantitatively, and finally a normalized eigenvector is obtained as the judgment matrix; when the cluster is perceived to be handling different tasks, the corresponding matrix is adaptively matched so as to correct to a suitable replica placement position.