WO2021073083A1 - Node load-based dynamic data partitioning system - Google Patents

Node load-based dynamic data partitioning system

Info

Publication number
WO2021073083A1
WO2021073083A1 (PCT/CN2020/090554)
Authority
WO
WIPO (PCT)
Prior art keywords
load
node
value
data
weight
Prior art date
Application number
PCT/CN2020/090554
Other languages
French (fr)
Chinese (zh)
Inventor
孟令伍
贺成龙
吴嘉逸
丁灿
刘蛰
李惠柯
顾学海
姜吉宁
陈铮
Original Assignee
南京莱斯网信技术研究院有限公司
Priority date
Filing date
Publication date
Application filed by 南京莱斯网信技术研究院有限公司 filed Critical 南京莱斯网信技术研究院有限公司
Publication of WO2021073083A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F 16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F 11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F 11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083 Techniques for rebalancing the load in a distributed system
    • G06F 9/5088 Techniques for rebalancing the load in a distributed system involving task migration

Definitions

  • The invention relates to the field of big data distributed computing and storage, and in particular to a dynamic data partitioning system based on node load.
  • Data partitioning refers to the distribution of data in a distributed system environment: a partitioning strategy must be designed and followed so that the entire data set is stored reasonably across the physical data nodes of the cluster. Simple data partitioning is easy to achieve, but making the system run efficiently and stably requires studying and designing a corresponding partitioning strategy.
  • The technical problem to be solved by the present invention is to provide a Spark-oriented MemSQL partitioning strategy system that dynamically adjusts the load balance of distributed computing and improves the response speed of data analysis.
  • The present invention provides a dynamic data partitioning system based on node load, built on a node-load-driven dynamic data partitioning mechanism and strategy.
  • The system includes a load monitoring module, a collection module, a data pre-partitioning module and a data migration module;
  • the load monitoring module is used to select load information indices and to monitor the load information index values on each node in the distributed cluster in real time;
  • the collection module is used to periodically collect the load information index values on each node in the distributed cluster;
  • the data pre-partitioning module is used to predict the load information index values on each node in the distributed cluster, obtain the processing capacity of each node according to the index weighting method, and finally distribute different data volumes according to the processing capacity of each node to complete data pre-partitioning;
  • the data migration module is used to trigger data migration between nodes to improve load balance when a load imbalance problem occurs in the distributed cluster.
  • The load monitoring module selects CPU utilization, memory utilization and bandwidth utilization as the load information index values, and monitors these index values on each node in the distributed cluster in real time by deploying the MemSQL (distributed in-memory database) resource monitoring service.
  • The collection module periodically obtains the load information index values on each node in the distributed cluster through the API (program interface) provided by the distributed Yarn resource management component and saves them in the database; a minimal collection sketch follows below.
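The following is a small, hedged sketch of such a collector, assuming the YARN ResourceManager REST endpoint /ws/v1/cluster/nodes is reachable; the exact response fields, the bandwidth source and the SQLite store (standing in for the MySQL database mentioned later) are illustrative assumptions rather than the patent's implementation.

```python
# Sketch of a periodic load collector; field names and the bandwidth source
# are deployment-specific assumptions, and SQLite stands in for MySQL.
import time
import sqlite3
import requests

RM_URL = "http://resourcemanager:8088/ws/v1/cluster/nodes"  # hypothetical host
PERIOD_S = 5  # collection period (the text suggests 5-15 s)

db = sqlite3.connect("load_history.db")
db.execute("""CREATE TABLE IF NOT EXISTS load_history (
    ts REAL, node TEXT, cpu REAL, mem REAL, bw REAL)""")

def collect_once():
    nodes = requests.get(RM_URL, timeout=3).json()["nodes"]["node"]
    rows = []
    for n in nodes:
        mem_total = n["usedMemoryMB"] + n["availMemoryMB"]
        mem_util = n["usedMemoryMB"] / mem_total if mem_total else 0.0
        used = n.get("usedVirtualCores", 0)
        avail = n.get("availableVirtualCores", 0)
        cpu_util = used / max(used + avail, 1)
        bw_util = 0.0  # bandwidth would come from a separate monitor (assumption)
        rows.append((time.time(), n["nodeHostName"], cpu_util, mem_util, bw_util))
    db.executemany("INSERT INTO load_history VALUES (?,?,?,?,?)", rows)
    db.commit()

if __name__ == "__main__":
    while True:
        collect_once()
        time.sleep(PERIOD_S)
```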
  • The data pre-partitioning module is used to predict the load information index values on each node in the distributed cluster, obtain the processing capacity of each node according to the AHP (Analytic Hierarchy Process) and entropy subjective-objective index weight integration method, and finally distribute different data volumes according to the processing capacity of each node to complete data pre-partitioning, which specifically includes the following steps:
  • Step 1, use the quadratic (double) exponential smoothing method to predict the load information index values.
  • Combining the single and double exponential smoothing formulas, the load forecast value T cycles ahead is obtained as follows:
    single smoothing: S_j^(1) = α·Y_j + (1 − α)·S_{j−1}^(1)
    double smoothing: S_j^(2) = α·S_j^(1) + (1 − α)·S_{j−1}^(2)
    forecast: Ŷ_{j+T} = a_j + b_j·T, with a_j = 2·S_j^(1) − S_j^(2) and b_j = (α/(1 − α))·(S_j^(1) − S_j^(2))
  • Here Y_j is the actual load information index value of the j-th cycle; S_{j−1}^(1) and S_j^(1) are the single exponential smoothing (prediction) values of cycles j−1 and j; S_{j−1}^(2) and S_j^(2) are the double exponential smoothing values of cycles j−1 and j; Ŷ_{j+T} is the predicted load information index value of cycle j+T; a_j and b_j are intermediate parameters; α is the smoothing coefficient.
  • The collection module sends the load information index values of each node collected in the first n−1 cycles from the database to the data pre-partitioning module; together with the index values of each node in the current cycle they form load data of size n. The actual value measured in the first cycle is taken as the initial value Y_1 and as the initial values of the single and double smoothing. The n load data are then used to predict the load information index values on each node for the next d cycles, the average value P of a node's index values over those d cycles is calculated, and the load information index value of each node in the cluster is finally determined;
  • Step 2, calculate the processing capacity of each node;
  • Step 3, distribute different amounts of data according to the processing capacity of each node.
  • In step 1, the value of the smoothing coefficient is obtained by calculating the standard deviation S:
    S = sqrt( (1/n) · Σ_{j=1..n} (Y_j − Ŷ_j)² )
  • where n represents the number of cycles taken; the deviation S is computed while adjusting the value of the smoothing coefficient α, and the α value corresponding to the smallest S is selected (a prediction sketch follows below).
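The following is a minimal sketch of step 1 under the reconstruction above (Brown's double exponential smoothing); the grid of candidate α values and the sample history are illustrative assumptions.

```python
# Minimal sketch: double exponential smoothing with deviation-minimizing alpha.
from math import sqrt

def double_exp_smoothing(history, alpha, horizon):
    s1 = s2 = history[0]            # initial values taken from the first cycle
    fitted = []
    for y in history:
        fitted.append(2 * s1 - s2 + (alpha / (1 - alpha)) * (s1 - s2))
        s1 = alpha * y + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    a = 2 * s1 - s2
    b = (alpha / (1 - alpha)) * (s1 - s2)
    forecast = [a + b * t for t in range(1, horizon + 1)]
    return fitted, forecast

def deviation(history, alpha):
    fitted, _ = double_exp_smoothing(history, alpha, 0)
    return sqrt(sum((y - f) ** 2 for y, f in zip(history, fitted)) / len(history))

def predict_load(history, horizon=3, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    best_alpha = min(grid, key=lambda a: deviation(history, a))
    _, forecast = double_exp_smoothing(history, best_alpha, horizon)
    return best_alpha, sum(forecast) / len(forecast)   # average P over d cycles

# Example: CPU utilization of one node over the last n cycles (sample data)
alpha, p = predict_load([0.42, 0.45, 0.50, 0.48, 0.55, 0.60, 0.58, 0.63])
print(alpha, round(p, 3))
```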
  • Step 2 includes the following steps:
  • Step 2-1, use the AHP subjective weighting method: in multi-attribute decision-making, the decision maker compares all evaluation indices pairwise to obtain the judgment matrix U = (A_ij)_{n×n}, where A_ij is the value obtained by comparing index A_i with index A_j. Odd values of 1, 3, 5, 7 and 9 mean that the former index is respectively equally important, somewhat more important, clearly more important, much more important and extremely more important than the latter; even values between 1 and 9 indicate an importance between those expressed by the two adjacent odd values (a value of 2, for example, lies between the importance expressed by 1 and 3), and A_ji = 1/A_ij.
  • Comparing CPU utilization, memory utilization and bandwidth utilization pairwise yields the 3×3 judgment matrix A, in which A_1, A_2 and A_3 represent the weight of the impact of a node's CPU utilization, memory utilization and bandwidth utilization, respectively, on the node's overall load.
  • Step 2-2, calculate the eigenvectors and index weights of the matrix: sum each column of the matrix to obtain the column-sum vector SUM_j, then normalize each column:
    B_ij = A_ij / Σ_i A_ij
  • where Σ_i A_ij is the column sum SUM_j and B_ij is the normalized value of A_ij; the B_ij form a new matrix B in which each column sums to 1;
  • sum each row of matrix B to obtain the eigenvector SUM_i and normalize the eigenvector to obtain the index weights, W_i = SUM_i / n;
  • the three index weights are finally obtained as W_1, W_2, W_3;
  • Step 2-3, check the consistency of the matrix: compute the maximum eigenvalue
    λ_max = (1/n) · Σ_{i=1..n} (AW)_i / W_i
    and the consistency index
    C.I. = (λ_max − n) / (n − 1)
  • where λ_max is the maximum eigenvalue, AW is the column vector obtained by multiplying the judgment matrix A by the weight vector W, n is the order of the matrix, W is the weight vector and C.I. represents the consistency index (an AHP weighting sketch follows below);
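As an illustration of steps 2-1 to 2-3, the following is a minimal sketch assuming a 3×3 pairwise judgment matrix over CPU, memory and bandwidth; the example matrix values are hypothetical, and R.I. = 0.89 is the constant quoted in the text.

```python
# Minimal AHP sketch: column/row normalization for weights plus consistency check.
def ahp_weights(A):
    n = len(A)
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]          # SUM_j
    B = [[A[i][j] / col_sums[j] for j in range(n)] for i in range(n)]      # B_ij
    row_sums = [sum(B[i]) for i in range(n)]                               # SUM_i
    W = [row_sums[i] / n for i in range(n)]                                # weights
    AW = [sum(A[i][j] * W[j] for j in range(n)) for i in range(n)]
    lam_max = sum(AW[i] / W[i] for i in range(n)) / n
    CI = (lam_max - n) / (n - 1)
    CR = CI / 0.89                       # R.I. value taken from the text
    return W, CR

# Example judgment matrix: CPU judged more important than memory and bandwidth
A = [[1, 3, 5],
     [1 / 3, 1, 3],
     [1 / 5, 1 / 3, 1]]
WS, CR = ahp_weights(A)
print([round(w, 3) for w in WS], "consistent" if CR < 0.1 else "adjust matrix")
```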
  • Step 2-4, calculate the objective weights with the entropy method: the entropy method is a mathematical method that reflects the degree of influence of an index on the comprehensive evaluation by judging the dispersion of that index, and it determines the weights objectively through the variability of the index values.
  • The weight of an index is positively correlated with its variability: the greater the variation of the index value, the greater its weight; conversely, the smaller the variation of the index value, the smaller its weight.
  • A load information decision matrix is built whose rows are the n prediction cycles and whose columns are CUR, MUR and BUR, where CUR_n, MUR_n and BUR_n respectively represent the CPU utilization, memory utilization and bandwidth utilization predicted for the n-th cycle of a node. After standardizing the matrix column by column, the entropy of each index is computed as
    E_j = −K · Σ_i p_ij · ln(p_ij), with K = 1/ln(n),
  • where E_j represents the entropy value of load information index j (E_1, E_2 and E_3 being the entropy values of CPU utilization, memory utilization and bandwidth utilization respectively), p_ij is the standardized proportion of index j in cycle i, and the difference coefficient is D_j = 1 − E_j;
  • Step 2-5, calculate the objective weight value WO_j of each load information index as WO_j = D_j / Σ_j D_j (an entropy weighting sketch follows below);
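A minimal sketch of steps 2-4 and 2-5 follows, assuming the common entropy-weight recipe (column normalization of the predicted CUR/MUR/BUR matrix); the standardization rule and the sample values are illustrative assumptions.

```python
# Minimal entropy-weight sketch over an n-by-3 matrix of predicted utilizations.
from math import log

def entropy_weights(M):
    n, k = len(M), len(M[0])
    col_sums = [sum(row[j] for row in M) for j in range(k)]
    P = [[row[j] / col_sums[j] for j in range(k)] for row in M]   # standardized R
    K = 1.0 / log(n)
    E = [-K * sum(p[j] * log(p[j]) for p in P if p[j] > 0) for j in range(k)]
    D = [1 - e for e in E]                                        # D_j = 1 - E_j
    return [d / sum(D) for d in D]                                # WO_j

# Example: predicted CPU / memory / bandwidth utilization over n = 5 cycles
M = [[0.40, 0.62, 0.10],
     [0.55, 0.63, 0.12],
     [0.70, 0.61, 0.09],
     [0.35, 0.64, 0.30],
     [0.80, 0.62, 0.11]]
print([round(w, 3) for w in entropy_weights(M)])
```

As in the discussion later in the text, the nearly constant memory column receives a small objective weight while the more variable CPU and bandwidth columns receive larger ones.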
  • Step 2-6, calculate the final load information index weight w_i of the node by combining the subjective weight WS_i and the objective weight WO_i through the subjective-objective weight adjustment coefficient;
  • w_i is the final node load weight;
  • w_1 represents the final CPU utilization weight;
  • w_2 represents the final memory utilization weight;
  • w_3 represents the final bandwidth utilization weight;
  • Step 3-4, compute the processing capacity of the node:
    CA_i = w_1·(1 − CAU_i) + w_2·(1 − MAU_i) + w_3·(1 − BAU_i), (1-13)
  • where CAU_i, MAU_i and BAU_i respectively represent the predicted CPU utilization, memory utilization and bandwidth utilization of the i-th node in the current cycle,
  • and CA_i represents the processing capacity of the i-th node.
  • Step 3 includes: calculate the proportion of data to be allocated to each node as
    DP_i = CA_i / Σ_{k=1..m} CA_k
  • where DP_i represents the proportion of the data volume that should be allocated to the i-th node and m represents the total number of nodes (a combined-weight and capacity sketch follows below).
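A minimal sketch of steps 2-6 through 3 follows; the linear combination w_i = β·WS_i + (1 − β)·WO_i is an assumption standing in for the unspecified adjustment-coefficient formula, while CA_i follows formula (1-13) and DP_i the proportion defined above.

```python
# Minimal sketch: combine weights, compute processing capacity and data proportions.
def combine_weights(WS, WO, beta=0.5):
    # beta is the subjective-objective adjustment coefficient (assumed linear mix)
    return [beta * ws + (1 - beta) * wo for ws, wo in zip(WS, WO)]

def processing_capacity(w, cau, mau, bau):
    # CA_i = w1*(1-CAU_i) + w2*(1-MAU_i) + w3*(1-BAU_i)   (1-13)
    return w[0] * (1 - cau) + w[1] * (1 - mau) + w[2] * (1 - bau)

def data_proportions(capacities):
    total = sum(capacities)
    return [c / total for c in capacities]                 # DP_i

# Example: three nodes with predicted (CPU, memory, bandwidth) utilization
w = combine_weights([0.633, 0.260, 0.106], [0.45, 0.05, 0.50])
nodes = [(0.8, 0.6, 0.2), (0.4, 0.5, 0.1), (0.2, 0.3, 0.1)]
CA = [processing_capacity(w, *n) for n in nodes]
print([round(dp, 3) for dp in data_proportions(CA)])
```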
  • The data migration module constructs selection queues of source and target machines by setting high and low load thresholds as the conditions for triggering data migration.
  • The source and target machines for data migration are then selected: the source machine is the node whose data are to be migrated, the target machine is the node that receives the migrated data, and the amount of data to be migrated is obtained. This specifically includes the following steps:
  • Step a1, select the source machine: combine the predicted load utilization of each index with the subjective-objective integrated weights to obtain the overall load value of each node, where Load_i represents the overall load value of the i-th node; compare each Load_i with the set threshold H_th, and if a node's load value exceeds H_th, add the node to the high-load node queue, the source queue being formed in descending order of overall load value.
  • Step a2, select the target machine: compare the overall load value of each node with the set threshold L_th, and if a node's load value is lower than L_th, add the node to the low-load node queue, the target queue being formed in ascending order of load value.
  • Step a3, perform data migration: the nodes in the high-load and low-load queues are matched in order and migrated in parallel.
  • The number of migrated partitions is given by formula (1-16), in which
  • N_q represents the number of partitions to be migrated,
  • N_y represents the number of partitions in the source machine, and
  • N_m represents the number of partitions in the target machine.
  • In some scenarios the thresholds need to be adjusted. For example, if 20 nodes exceed the high-load threshold of 0.9 but only 10 nodes fall below the low-load threshold of 0.2, the low-load threshold should be raised to about 0.35 so that the high-load nodes can shed as much load pressure as possible while migration still proceeds in parallel with one-to-one matching of high-load and low-load nodes;
  • alternatively, the high-load threshold, previously set to 0.9, can be appropriately lowered to about 0.75 so that the number of nodes in the high-load queue is equal to, or slightly less than, the number of nodes in the low-load queue;
  • the number of migrated partitions is then set according to formula (1-16), and the data can be migrated (a migration-queue sketch follows below).
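A minimal sketch of the migration-queue construction follows; the threshold values are taken from the example above, and migrate_count is a stand-in for formula (1-16), which is not reproduced in this text.

```python
# Minimal sketch: overall load per node, high/low load queues, one-to-one matching.
def overall_load(w, cur, mur, bur):
    return w[0] * cur + w[1] * mur + w[2] * bur            # Load_i

def build_queues(loads, h_th=0.75, l_th=0.35):
    sources = sorted((n for n, l in loads.items() if l > h_th),
                     key=lambda n: loads[n], reverse=True)  # Sy, descending
    targets = sorted((n for n, l in loads.items() if l < l_th),
                     key=lambda n: loads[n])                # Dm, ascending
    return sources, targets

def migrate_count(n_src_parts, n_dst_parts):
    # placeholder for formula (1-16); halving the gap is an assumption
    return max((n_src_parts - n_dst_parts) // 2, 0)

# Example with hypothetical weights and per-node utilizations
w = (0.4, 0.4, 0.2)
loads = {"node%d" % i: overall_load(w, c, m, b)
         for i, (c, m, b) in enumerate([(0.9, 0.8, 0.5), (0.3, 0.2, 0.1),
                                        (0.5, 0.5, 0.4), (0.95, 0.9, 0.6)])}
src, dst = build_queues(loads)
for s, d in zip(src, dst):
    print(s, "->", d, "migrate", migrate_count(10, 4), "partitions")
```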
  • the system of the present invention involves the following core contents:
  • Resource monitoring can be performed by deploying cluster servers.
  • the main monitoring indicators are the utilization of CPU, memory and bandwidth.
  • The real-time cluster resource monitoring interface paves the way for the collection module; by combining load prediction with the index weight determination method it can be determined whether a node is a high-load or low-load node, which paves the way for the data migration module.
  • the load-based partition strategy of the present invention mainly uses CPU, memory, and bandwidth utilization to represent the overall load value of the node.
  • The collection module collects the load information of all nodes at regular intervals. If the collection cycle is too short, it increases the load of the central node and consumes bandwidth, which affects the performance of the distributed system. If the collection cycle is too long, outdated data are used, real-time behaviour is lost, and wrong partitioning decisions may be made during data partitioning; moreover, when emergencies occur, nodes that need balancing are not handled in time while nodes that do not need balancing are handled instead. To collect node load information in a timely and accurate manner, cluster resource monitoring can be deployed to collect resource information into a cache array and to persist the historical resource information to the database. At present, the time interval used in most papers is between 5 s and 15 s, and collection can follow the period set by the user.
  • The prediction module predicts the future load of each node in order to determine how data volumes are distributed. Research by relevant personnel has concluded that host load changes exhibit self-similarity and long-term dependence. For load with such characteristics, a prediction mechanism can capture the true overall trend of a node's load at the time of data distribution, so that data are partitioned more effectively and incorrect partitioning decisions are prevented.
  • The present invention selects CPU utilization CUR, memory utilization MUR and bandwidth utilization BUR to judge the load of a node. Because Spark-MemSQL applications may be CPU-intensive, memory-intensive, transmission-intensive or mixed, the weight of each index is likely to differ between application scenarios, so a weight ratio must be determined for each index. The load model is sensitive to these weights: the greater the weight given to an index, the more that index affects the total load value. For example, suppose the CUR, MUR and BUR of two nodes in a cluster are <0.9, 0.2, 0.2> and <0.4, 0.6, 0.5> respectively.
  • The CPU load of the first node is very high and has reached a bottleneck, while the load of the second node is relatively even.
  • Load balancing should therefore be performed on the first node first: as little data as possible should be distributed to it, or part of its data should be migrated to other nodes with lower load.
  • With an unreasonable weight assignment, however, the computed load of the first node is 0.27 and that of the second node is 0.54,
  • so the second node would be selected for load balancing. This confirms that differences in weight values affect the overall load judgment, and a reasonable index weighting method is therefore needed to determine the weight of each index and thus the overall load of each node (a worked example follows below).
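As an illustration (the patent does not state the weights behind the 0.27 and 0.54 figures), one weight assignment consistent with them is (w_1, w_2, w_3) = (0.1, 0.5, 0.4) for CPU, memory and bandwidth:

```latex
% Hypothetical weights (0.1, 0.5, 0.4) reproducing the quoted load values
\mathrm{Load}_1 = 0.1 \cdot 0.9 + 0.5 \cdot 0.2 + 0.4 \cdot 0.2 = 0.27
\mathrm{Load}_2 = 0.1 \cdot 0.4 + 0.5 \cdot 0.6 + 0.4 \cdot 0.5 = 0.54
```

Under such a weighting, the evenly loaded second node appears heavier than the CPU-bound first node, which is exactly the mis-ranking that the subjective-objective weight integration method is designed to avoid.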
  • MemSQL's default partitioning method assigns the same number of partitions to every node. Because the cluster nodes are heterogeneous and therefore have different processing capabilities, this causes data skew between nodes and leads to an unbalanced cluster load.
  • The present invention provides a subjective-objective weight integration method that finely quantifies the computing power of each node in the cluster, makes full use of each node's computing resources, and improves the overall response speed of big data analysis applications. When the cluster suffers from load imbalance, the present invention provides a dynamic load balancing strategy that more flexibly keeps the resource utilization of the distributed cluster stable. In relatively independent in-memory and iteratively executed parallel applications, such as the association analysis, clustering, neural network and other machine learning algorithms widely used in the clustering of event comments in the company's public opinion analysis system and in its related-person association analysis modules, this ultimately speeds up the response of the application.
  • Figure 1 is a flow chart of the dynamic data partition mechanism based on node load
  • Figure 2 is a diagram of the cluster resource monitoring interface
  • Figure 3 is a flow chart of the prediction mechanism
  • Figure 4 is a flowchart of calculating index weights by AHP analytic hierarchy process
  • Figure 5 is a flowchart of calculating index weights by entropy method
  • Figure 6 is the integration diagram of Spark and MemSql.
  • Figure 7 is a comparison diagram of the CPU utilization prediction of the correlation analysis application.
  • Figure 8 is a comparison diagram of the CPU utilization prediction of the Kmeans clustering analysis application.
  • Figure 9 is a performance comparison chart of the correlation analysis application pre-partitioning strategy.
  • Figure 10 is a comparison diagram of the performance of the Kmeans clustering analysis application pre-partitioning strategy.
  • Figure 11 is a comparison diagram of node load utilization of different pre-partitioning strategies applied by correlation analysis.
  • Figure 12 is a comparison diagram of node load utilization of different pre-partitioning strategies applied by Kmeans clustering analysis.
  • Figure 13 is a performance comparison diagram of the correlation analysis migration strategy.
  • Figure 14 is a performance comparison chart of the Kmeans clustering analysis migration strategy.
  • Figure 15 is a comparison diagram of the average node load utilization before and after the correlation analysis data is migrated.
  • Figure 16 is a comparison chart of average node load utilization before and after data migration of Kmeans cluster analysis.
  • The system includes a load monitoring module, a collection module, a data pre-partitioning module and a data migration module.
  • The entire Spark-MemSQL integrated cluster has been put into application use.
  • The master node in the load monitoring module regularly reads the load information of each index on the slave nodes and dynamically displays the CPU, memory and bandwidth utilization in the monitoring interface; the collection module then saves the load information in a cache array and periodically persists it in the MySQL database to provide index load information for load prediction. When a large amount of new data is imported, the prediction module in the data pre-partitioning module predicts each index of each node, and the weight of each index is then obtained through the index weight determination method.
  • The processing capacity of each node is obtained from the predicted index information and the index weight values, data are then distributed according to that processing capacity, and data pre-partitioning is completed. If load imbalance occurs during operation and reaches the set load threshold, the high-load and low-load nodes are added to the source and target machine queues and block migration is carried out according to the migration strategy; if load imbalance is encountered again after the migration, the same process is used for dynamic block migration.
  • Resource monitoring can be performed by deploying cluster servers. As shown in Figure 2, the main monitoring indicators are the utilization of CPU, memory, and bandwidth.
  • The real-time cluster resource monitoring interface paves the way for the collection module; combined with load forecasting and the index weight determination method, it can be determined whether a node is a high-load or low-load node, paving the way for the data migration module.
  • the load-based partition strategy of the present invention mainly uses CPU, memory, and bandwidth utilization to represent the overall load value of the node.
  • The collection module collects the load information of all nodes at regular intervals. If the collection cycle is too short, it increases the load of the central node and consumes bandwidth, which affects the performance of the distributed system; if the collection cycle is too long, outdated data are used, real-time behaviour is lost, and wrong partitioning decisions may be made during data partitioning. Moreover, when emergencies occur, nodes that need balancing are not handled in time while nodes that do not need balancing are handled instead.
  • The API provided by the Yarn resource management component can be used to collect resource information into a cache array and to persist the historical resource information to the database. At present, the time interval used in most papers is between 5 s and 15 s, and collection can follow the period set by the user.
  • The traditional data distribution strategy uses only the current real-time load information of a node as the basis for judging the data partition. Suppose a node's load shows an instantaneous peak or trough and then returns to normal: under the traditional strategy this spike inevitably affects the final data distribution decision, easily leading to unbalanced data distribution and unnecessary system overhead. It is therefore necessary to prevent incorrect pre-partitioning decisions caused by instantaneous load peaks. If the data have already been allocated but unexpected situations arise, such as removing nodes because of downtime, adding nodes for horizontal expansion, or an extremely unbalanced load, block migration is required to balance the load, and the load prediction module is needed to determine the amount to migrate.
  • The double exponential smoothing method applies exponential smoothing again on the basis of single exponential smoothing; it cannot be used for prediction on its own.
  • Combined with single exponential smoothing, it yields a mathematical model from which the predicted value at the next moment can be determined.
  • Most forecasting models choose the quadratic exponential smoothing method, because single exponential smoothing and the average-load method are better suited to time series with a horizontal trend: when the actual values are rising or falling, the deviation between predicted and actual values becomes relatively large and an obvious lag appears.
  • Analysis applications on the Spark-MemSQL integrated framework do produce rising or falling load, and the quadratic exponential smoothing method handles this application scenario better, since it can use the law of the lag deviation to identify the development trend of the values. The present invention therefore adopts a quadratic exponential smoothing model for load prediction.
  • Combining the single and double exponential smoothing formulas given above, the load forecast value T periods ahead is obtained as Ŷ_{j+T} = a_j + b_j·T.
  • Here Y_j is the actual value of the j-th period; S_{j−1}^(1) and S_j^(1) are the predicted (single smoothing) values of periods j−1 and j; S_{j−1}^(2) and S_j^(2) are the double exponential smoothing values of periods j−1 and j; Ŷ_{j+T} is the predicted value of period j+T; a_j and b_j are intermediate parameters; α is the smoothing coefficient, α ∈ [0,1]. The predicted value is strongly affected by α: the smaller the value of α, the greater the influence of historical data; the larger the value of α, the greater the influence of recent data.
  • When the data fluctuate little, a smaller α should be selected, such as 0.05-0.15; when the data fluctuate but the long-term change is small, a slightly larger α should be selected, such as 0.1-0.5; when the data fluctuate greatly and the long-term change is also large, a larger α should be selected, such as 0.6-0.8; when the data show an obvious rise or fall, a larger α should be selected, such as 0.6-1.
  • n represents the number of cycles taken
  • j represents the j-th cycle.
  • the flow of the prediction mechanism is shown in Figure 3.
  • The deviation S is computed while adjusting the value of the smoothing coefficient α, and the α value corresponding to the smallest S is selected.
  • the values of n and d are set by the user.
  • The present invention calculates the overall load value of each node by combining quadratic-smoothing load prediction with the subjective-objective AHP-and-entropy index weight integration method, and finally allocates the corresponding data volume according to the overall load value.
  • A_1, A_2 and A_3 represent the weight of the impact of a node's CPU utilization, memory utilization and bandwidth utilization, respectively, on the node's overall load.
  • ΣA_ij is the column sum SUM_j, from which the new matrix B is obtained.
  • The sum of the values in each column of matrix B is 1.
  • The index weights of the three indices are W_1, W_2, W_3.
  • ⁇ max is the maximum characteristic root
  • AW represents the matrix A and the weight vector W are multiplied to obtain a column vector
  • n represents the order of the matrix
  • W represents the weight vector
  • C.I. represents the consistency index
  • n represents the order of the matrix
  • R.I. represents the average random consistency index, a constant that can be looked up in the standard table according to the order of the matrix.
  • For the third-order matrix used here, R.I. = 0.89; if C.R. < 0.1, the comparison matrix is consistent; if C.R. > 0.1, the comparison matrix is not consistent and needs to be adjusted.
  • Entropy method is a mathematical method that reflects the degree of influence of an index on comprehensive evaluation by judging the dispersion of an index, and can objectively determine the weight through the degree of variation of the index value.
  • the weight of the index is positively correlated with the degree of variability, that is, the greater the degree of variation of the index value, the greater its weight; conversely, the smaller the degree of variation of the index value, the smaller its weight.
  • the process of calculating the index weight by the entropy method is shown in Figure 5.
  • n represents the number of cycles;
  • CUR, MUR, and BUR represent the utilization of CPU, memory, and bandwidth, respectively.
  • The subjective-objective weight adjustment coefficient balances the AHP subjective weights and the entropy objective weights.
  • Node data distribution: first, the subjective-objective integrated weights of the three indices CPU, memory and bandwidth in the load are obtained from the previous module as w_1, w_2 and w_3 respectively.
  • CA_i = w_1·(1 − CAU_i) + w_2·(1 − MAU_i) + w_3·(1 − BAU_i), (1-13)
  • where CAU_i, MAU_i and BAU_i represent the predicted utilization of CPU, memory and bandwidth respectively, and i denotes the i-th node.
  • DP_i represents the proportion of the data volume that should be allocated to the i-th node;
  • m represents the total number of nodes (a partition-allocation sketch follows below).
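A small sketch of turning the DP_i proportions into integer MemSQL partition counts follows, assuming the cluster-wide total of 32 partitions used in the experiments; the largest-remainder rounding rule is an assumption, not taken from the text.

```python
# Sketch: allocate a fixed total number of partitions to nodes according to DP_i.
def allocate_partitions(dp, total=32):
    raw = [p * total for p in dp]
    counts = [int(r) for r in raw]
    remainder = total - sum(counts)
    # hand leftover partitions to the nodes with the largest fractional parts
    order = sorted(range(len(dp)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:remainder]:
        counts[i] += 1
    return counts

print(allocate_partitions([0.30, 0.25, 0.20, 0.15, 0.10]))   # e.g. [10, 8, 6, 5, 3]
```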
  • A selection queue of source and target machines is constructed. If, after data pre-partitioning, load imbalance occurs or nodes are added or removed, the source and target machines for data migration must be selected.
  • The source machine is the node whose data are to be migrated;
  • the target machine is the node that accepts the migrated data, and the number of partitions to migrate is determined.
  • The predicted load utilization of each index is combined with the subjective-objective weight integration method to obtain the load weight of each index, and then the overall load value Load_i of each node is obtained.
  • The load value formula is as follows:
    Load_i = w_1·CUR_i + w_2·MUR_i + w_3·BUR_i
  • where CUR_i, MUR_i, BUR_i and w_1, w_2, w_3 are the predicted CPU utilization, memory utilization and bandwidth utilization and the corresponding weight values, respectively.
  • The load value Load_i of each node is compared with the set threshold, and if the load value of a node exceeds the threshold H_th, the node is added to the high-load node queue.
  • The source machine selection queue Sy = {s_1, s_2, ..., s_m} is formed in descending order of overall load value.
  • The load values of the nodes in the Sy queue are sorted in descending order, and source machines are selected in descending order of overall load value.
  • The load value Load_i of each node is compared with the set threshold, and if the load value of a node is lower than the threshold L_th, the node is added to the low-load node queue.
  • The target machine selection queue Dm = {d_1, d_2, ..., d_z} is formed in ascending order of Load_i.
  • N q represents the number of partitions to be migrated
  • N y represents the number of partitions in the source machine
  • N m represents the number of partitions in the target machine.
  • the migration can be performed in parallel to reduce the migration overhead.
  • the system can achieve load balancing. For unexpected situations where nodes are added or deleted, this migration strategy can also be adopted.
  • the distributed memory database MemSql adopts a master-slave structure, uses Hash as a storage method, and uses a data partition as the smallest storage unit block. Spark also uses a master-slave structure.
  • the Master node (master node) manages the resources of the entire cluster, and the Worker node (slave node) manages the resources of each computing node, regularly reports the node resource status to the Master node, and starts the Executor to perform calculations.
  • Spark and MemSQL can be combined in two application scenarios: in one, Spark and MemSQL are used as two relatively independent frameworks; in the other, Spark and MemSQL form an integrated framework.
  • The method of localized data reading and analysis is adopted, and the two are integrated through the MemSQL Spark Connector component.
  • The component is started in the background as a daemon process and connects the Master in Spark with the main aggregator in MemSQL; the Worker nodes of Spark can then obtain the metadata of the MemSQL main aggregator through the Master node.
  • The metadata describe on which nodes the data reside and in which partitions on those nodes, thereby ensuring that in the actual program the Spark Worker nodes use the MemSqlRDD interface to read, write, compute and analyse data locally and in parallel from the MemSQL storage Leaf nodes.
  • the smallest storage granularity in MemSql is Partition.
  • By default each node is assigned the same number of Partitions. Because the cluster nodes are heterogeneous and have different processing capabilities, this causes data skew between nodes. Since Spark in this framework adopts localized data analysis, data are analysed and processed on whichever node they reside.
  • The number of Partitions in MemSQL directly determines the number of RDD tasks in Spark, that is, the number of tasks is positively correlated with the number of partitions.
  • Under the default partitioning method this causes serious load imbalance. For example, if a high-load data node holds many blocks that need to be processed and analysed, the execution time of the whole job becomes longer, because under Spark job scheduling the job finishes only when all tasks are complete. In real applications the problem of data skew is widespread, and the resulting imbalance in processing-node load is an unavoidable problem when applying the Spark-MemSQL framework.
  • The Spark-MemSQL integrated cluster environment is deployed on a local area network. There are 5 nodes in the experiment, and the total number of partitions is set to 32. Using a data set from a manufacturing company, the effectiveness of the dynamic data partitioning strategy based on load forecasting combined with the AHP-and-entropy integrated weighting method is verified.
  • The FIS_PRODUCT table of a certain manufacturing company is used as the test data set.
  • As shown in Table 1, there are more than 50 million rows of data.
  • Each piece of data includes time ID, plant category, product category, product length, product stretch length, product weight, etc.
  • the LENGTH and WEIGHT columns are used as the data set for the correlation analysis application test.
  • the LENGTH, DRAWLENGTH and WEIGHT columns can be used as the data set for the Kmeans application test. Different applications use different data sets for testing.
  • To test and verify the prediction module, related applications are run to imitate the actual application environment under the Spark-MemSQL integration framework; the load utilization in the application environment is predicted with a cycle of 5 s, and the deviation between the predicted and actual values is then computed to adjust the smoothing coefficient, paving the way for the partition strategy comparison test and verifying the effectiveness of the prediction algorithm in this application scenario.
  • The test process of the prediction module is: read the collected historical load information, use the quadratic smoothing prediction algorithm to predict the load, calculate the deviation S between the predicted and true values, and reduce S by adjusting the smoothing coefficient α. The same method is used to adjust the smoothing coefficient for different application scenarios.
  • Four different pre-partitioning strategies, the default strategy, load forecasting + AHP weighting, load forecasting + entropy weighting, and load forecasting + AHP-and-entropy integrated weighting, are compared by running the same application and recording its execution time, to verify the effectiveness of the scheme.
  • Implementation step one, the load prediction algorithm: different applications are tested separately, the load of a given node is collected and predicted, the effectiveness of the prediction algorithm in different application scenarios is verified, and the smoothing coefficient α of the different load indices in different application scenarios is obtained. As shown in Figures 7 and 8, the CPU utilization of the two applications fluctuates; the quadratic exponential smoothing method predicts the CPU utilization more accurately and avoids the impact of instantaneous peaks. The same method is used to predict and compare the other indices, and the smoothing coefficients α of the different indices in the different application scenarios are finally obtained, as shown in Tables 2 and 3.
  • Implementation step two, the pre-partitioning strategy: the experiments are divided into two groups using different pre-partitioning strategies, and each group runs the same application. The first group of experiments uses the association analysis application; the second group uses the Kmeans cluster analysis application. The execution times of the applications under the different partitioning strategies are compared to verify the effectiveness of the scheme.
  • the association analysis and Kmeans clustering application are performed respectively.
  • the default partition strategy has the worst effect.
  • The partitioning strategy based on prediction + AHP-and-entropy weight integration designed herein has the best effect, and the effect becomes more significant as the amount of data increases.
  • The AHP weighting method is a subjective method: it does not match the weights to the actual application scenario and is therefore not objective. The entropy weighting method derives weights from the variability of the index values; memory utilization changes slowly but is used continuously,
  • because the data computation of the Spark-MemSQL framework is carried out in memory, so memory usage remains relatively steady, while bandwidth utilization varies greatly but its absolute level is very low.
  • the same application is executed for different pre-partitioning strategies, and the overall average load utilization rate of each node in the entire application process is calculated.
  • the default partitioning strategy has serious load imbalance.
  • The pre-partitioning strategies combining prediction + AHP, prediction + entropy, and prediction + AHP-and-entropy weight integration all improve the balance of the cluster load well.
  • Implementation step three, the migration strategy: when load imbalance is encountered in the Spark-MemSQL framework, the data migration strategy is applied and the same application is run again; the load status of the different nodes is recorded periodically through the monitoring interface, and the execution times of the application before and after migration are compared, taking the time cost of the migration into account, to verify the effectiveness of the scheme.
  • Figure 13 and Figure 14 show the effectiveness of the migration strategy, which can improve the load balance of the cluster and improve the response speed of the application to a certain extent.
  • When the amount of data is small, that is, below 30 million rows for the correlation analysis application and below 20 million rows for the Kmeans analysis application, the load does not reach the set threshold and migration is not triggered.
  • When the amount of data is relatively large, that is, when the correlation analysis application reaches 30 million rows and the Kmeans analysis application reaches 20 million rows, the load reaches the threshold and migration is triggered.
  • Although migration improves the load balance of the cluster, it incurs a time cost, so the total time initially becomes longer.
  • As the data volume grows further and the load imbalance intensifies, the migration overhead becomes relatively small and the response speed of the application improves.
  • The migration test is performed on different applications, and the overall average load utilization of each node during the whole application run is compared before and after the migration; it can be seen that migration improves the balance of the cluster load.
  • The partitioning strategy based on load prediction + AHP index weight judgment has the best effect: it resolves the cluster load balance and improves the response speed of the application. When load imbalance occurs after the data have been distributed, migration likewise resolves the cluster load balance and improves the response speed of the application.
  • the present invention provides a data dynamic partition system based on node load.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A node load-based dynamic data partitioning system, comprising load monitoring, collection, prediction, data pre-partitioning, data migration and similar modules. A quadratic (double) exponential smoothing method is used to predict node load, and AHP is combined with an entropy index weighting method so that corresponding partitioning policies can be obtained for different data analysis applications; the load balance of the system can thus be adjusted dynamically and the response speed of applications improved. The system targets the application scenario of the Spark and MemSQL distributed integrated framework: because node resources in a distributed environment are heterogeneous, the computing resources of each node are used fully in order to reduce the consumption of data transmission between nodes, and the parallel computing efficiency of application analysis is improved through load balancing. A node load-based dynamic data partitioning mechanism and policy are therefore proposed to improve the load balance of the system and increase the response speed of applications, thereby assisting relevant personnel in decision-making.

Description

A dynamic data partitioning system based on node load

Technical field

The invention relates to the field of big data distributed computing and storage, and in particular to a dynamic data partitioning system based on node load.

Background art

The development of big data has directly driven the development of various distributed computing frameworks, and excellent distributed storage frameworks such as HBASE, HDFS and MemSQL have appeared one after another. However, many storage frameworks suffer from unbalanced cluster load caused by data skew due to unreasonable partitioning. To improve the real-time performance of cluster data analysis and processing, it is necessary to study data partitioning strategies for the cluster. Data partitioning refers to the distribution of data in a distributed system environment: a partitioning strategy must be designed and followed so that the entire data set is stored reasonably on every physical data node of the cluster. Simple data partitioning is easy to achieve, but making the system run efficiently and stably requires studying and designing a corresponding partitioning strategy. An improperly designed data partitioning strategy leads to inefficient computation, high access cost and heavy network load. In the design of distributed system partitioning strategies, the basic principles of data partitioning are: improve node load balance, improve the response efficiency of data analysis applications, provide timely decision support for enterprises, and increase benefits.
Summary of the invention

Objective of the invention: the technical problem to be solved by the present invention is to provide a Spark-oriented MemSQL partitioning strategy system that dynamically adjusts the load balance of distributed computing and improves the response speed of data analysis.

Technical solution: the present invention provides a dynamic data partitioning system based on node load, built on a node-load-driven dynamic data partitioning mechanism and strategy. The system includes a load monitoring module, a collection module, a data pre-partitioning module and a data migration module;

the load monitoring module is used to select load information indices and to monitor the load information index values on each node in the distributed cluster in real time;

the collection module is used to periodically collect the load information index values on each node in the distributed cluster;

the data pre-partitioning module is used to predict the load information index values on each node in the distributed cluster, obtain the processing capacity of each node according to the index weighting method, and finally distribute different data volumes according to the processing capacity of each node to complete data pre-partitioning;

the data migration module is used to trigger data migration between nodes to improve load balance when a load imbalance problem occurs in the distributed cluster.

The load monitoring module selects CPU utilization, memory utilization and bandwidth utilization as the load information index values, and monitors these index values on each node in the distributed cluster in real time by deploying the MemSQL (distributed in-memory database) resource monitoring service.

The collection module periodically obtains the load information index values on each node in the distributed cluster through the API (program interface) provided by the distributed Yarn resource management component and saves them in the database.

The data pre-partitioning module is used to predict the load information index values on each node in the distributed cluster, obtain the processing capacity of each node according to the AHP (Analytic Hierarchy Process) and entropy subjective-objective index weight integration method, and finally distribute different data volumes according to the processing capacity of each node to complete data pre-partitioning, which specifically includes the following steps:
Step 1, use the quadratic (double) exponential smoothing method to predict the load information index values.

The single exponential smoothing formula is as follows:

S_j^(1) = α·Y_j + (1 − α)·S_{j−1}^(1)

The double exponential smoothing formula is as follows:

S_j^(2) = α·S_j^(1) + (1 − α)·S_{j−1}^(2)

Combining the single and double exponential smoothing formulas, the load forecast value of the T-th future cycle is obtained as follows:

Ŷ_{j+T} = a_j + b_j·T, with a_j = 2·S_j^(1) − S_j^(2) and b_j = (α/(1 − α))·(S_j^(1) − S_j^(2))

where Y_j is the actual value of the load information index in the j-th cycle; S_{j−1}^(1) and S_j^(1) are the single exponential smoothing (prediction) values of cycles j−1 and j; S_{j−1}^(2) and S_j^(2) are the double exponential smoothing values of cycles j−1 and j; Ŷ_{j+T} is the predicted load information index value of cycle j+T; a_j and b_j are intermediate parameters; α is the smoothing coefficient.

The collection module sends the load information index values of each node collected in the first n−1 cycles from the database to the data pre-partitioning module; together with the index values of each node in the current cycle they form load data of size n. The actual value measured in the first cycle is taken as the initial value Y_1 and as the initial values of the single and double smoothing. The n load data are then used to predict the load information index values on each node for the next d cycles, the average value P of a node's index values over those d cycles is calculated, and the load information index value of each node in the cluster is finally determined.

Step 2, calculate the processing capacity of each node.

Step 3, distribute different data volumes according to the processing capacity of each node.
In step 1, the value of the smoothing coefficient is obtained by calculating the standard deviation S:

S = sqrt( (1/n) · Σ_{j=1..n} (Y_j − Ŷ_j)² )

where n represents the number of cycles taken; the deviation S is computed while adjusting the value of the smoothing coefficient α, and the α value corresponding to the smallest S is selected.

Step 2 includes the following steps:
Step 2-1, use the AHP subjective weighting method: in multi-attribute decision-making, the decision maker compares all evaluation indices pairwise to obtain the judgment matrix U = (A_ij)_{n×n}, where A_ij is the value obtained by comparing index A_i with index A_j. Odd values of 1, 3, 5, 7 and 9 mean that the former index is respectively equally important, somewhat more important, clearly more important, much more important and extremely more important than the latter; even values between 1 and 9 indicate an importance between those expressed by the two adjacent odd values, a value of 2, for example, lying between the importance expressed by 1 and 3; and A_ji = 1/A_ij.

Comparing CPU utilization, memory utilization and bandwidth utilization pairwise gives the 3×3 judgment matrix A, where A_1, A_2 and A_3 represent the weight of the impact of a node's CPU utilization, memory utilization and bandwidth utilization, respectively, on the node's overall load. Each column of the judgment matrix A is normalized to obtain the column eigenvector, each row is then normalized to obtain the row eigenvector, the weight ratio of each index is derived, and a consistency check is performed on the judgment matrix A; the subjective weights of a node's CPU, memory and bandwidth are finally obtained as WS_1, WS_2 and WS_3, with WS_1 + WS_2 + WS_3 = 1.
步骤2-2,计算矩阵的特征向量和指标权重:Step 2-2, calculate the eigenvectors and index weights of the matrix:
对矩阵各列求和,列和的向量为:SUM jSum the columns of the matrix, the vector of the column sum is: SUM j ;
对矩阵每一列进行归一化处理,公式如下:To normalize each column of the matrix, the formula is as follows:
Figure PCTCN2020090554-appb-000012
Figure PCTCN2020090554-appb-000012
∑A ij的值为各列的和SUM j,B ij表示A ij归一化后的数据,根据B ij得到新矩阵B,B矩阵中每一列值的和都为1; The value of ∑A ij is the sum of each column SUM j , and B ij represents the normalized data of A ij . According to B ij, a new matrix B is obtained. The sum of the values of each column in the B matrix is 1;
对矩阵B每一行求和,即得出特征向量SUM iSum each row of matrix B to obtain the eigenvector SUM i ;
计算指标权重,对特征向量进行归一化处理,公式如下:Calculate the index weight and normalize the feature vector, the formula is as follows:
W_i = \frac{SUM_i}{\sum_{i=1}^{n} SUM_i}
根据上述公式,最终得到三种指标权重分别为W 1,W 2,W 3According to the above formula, the three index weights are finally obtained as W 1 , W 2 , W 3 ;
步骤2-3,进行矩阵一致性检验:Step 2-3, check the consistency of the matrix:
为了检验得出指标权重是否正确,需要对指标进行比较,例如:如果A>B,B>C,那么必须得出A>C,反之,则一致性不成立。所以需要对矩阵的一致性进行检验,确保没有出现以上的错误。In order to test whether the index weight is correct, it is necessary to compare the indexes. For example, if A>B, B>C, then A>C must be obtained, otherwise, the consistency is not established. Therefore, it is necessary to check the consistency of the matrix to ensure that the above errors do not occur.
计算矩阵的最大特征根,公式如下:Calculate the largest characteristic root of the matrix, the formula is as follows:
\lambda_{max} = \frac{1}{n}\sum_{i=1}^{n}\frac{(AW)_i}{W_i}
where λ_max is the largest eigenvalue, AW denotes the column vector obtained by multiplying the matrix A by the weight vector W, n is the order of the matrix, and W is the weight vector;
Calculate the consistency index (C.I.) of the judgment matrix with the following formula:
C.I. = \frac{\lambda_{max} - n}{n - 1}
其中,C.I.代表一致性指标,n表示矩阵的阶数;Among them, C.I. represents the consistency index, and n represents the order of the matrix;
计算随机一致性比率C.R.,计算公式如下:Calculate the random consistency ratio C.R., the calculation formula is as follows:
C.R. = \frac{C.I.}{R.I.}
其中,R.I.代表平均随机一致性指标,是一个常量,根据阶数可以在量表里查询;3阶R.I.=0.89,如果C.R.<0.1,说明对比矩阵保持一致性;如果C.R.>0.1,则表示对比矩阵不具有一致性,需要进行调整;Among them, RI stands for the average random consistency index, which is a constant, which can be queried in the scale according to the order; the third-order RI=0.89, if CR<0.1, it means that the comparison matrix is consistent; if CR>0.1, it means comparison The matrix is not consistent and needs to be adjusted;
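A small Python sketch of steps 2-1 to 2-3 follows (illustrative only, not part of the patent): the example 3×3 judgment matrix is an assumption, and the R.I. constant used here is the commonly tabulated value for a 3×3 matrix, which may differ from the value quoted in the text above.

```python
# Sketch of AHP subjective weighting and consistency check (steps 2-1 to 2-3).
# The judgment matrix below is an illustrative example, not data from the patent.

A = [
    [1.0, 3.0, 5.0],      # CPU vs CPU, CPU vs memory, CPU vs bandwidth
    [1/3, 1.0, 3.0],      # memory row
    [1/5, 1/3, 1.0],      # bandwidth row
]
n = len(A)

# Column sums SUM_j and column-normalised matrix B
col_sum = [sum(A[i][j] for i in range(n)) for j in range(n)]
B = [[A[i][j] / col_sum[j] for j in range(n)] for i in range(n)]

# Row sums of B give the unnormalised eigenvector SUM_i; normalising yields the weights
row_sum = [sum(B[i]) for i in range(n)]
W = [r / sum(row_sum) for r in row_sum]          # subjective weights WS_1, WS_2, WS_3

# Consistency check: lambda_max, C.I. and C.R.
AW = [sum(A[i][j] * W[j] for j in range(n)) for i in range(n)]
lambda_max = sum(AW[i] / W[i] for i in range(n)) / n
CI = (lambda_max - n) / (n - 1)
RI = 0.58                                        # commonly tabulated value for a 3x3 matrix (assumption)
CR = CI / RI
print("weights:", [round(w, 3) for w in W], "C.R.:", round(CR, 3))
```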
Step 2-4, calculation with the entropy-based objective weighting method: the entropy method is a mathematical method that reflects the degree of influence of an index on the comprehensive evaluation by judging the dispersion of that index, so that the weights can be determined objectively from the variability of the index values. The weight of an index is positively correlated with its variability: the greater the variation of the index values, the larger its weight; conversely, the smaller the variation, the smaller its weight.
构建负载信息决策矩阵M:Construct load information decision matrix M:
M = \begin{pmatrix} CUR_1 & MUR_1 & BUR_1 \\ CUR_2 & MUR_2 & BUR_2 \\ \vdots & \vdots & \vdots \\ CUR_n & MUR_n & BUR_n \end{pmatrix}
其中,CUR n、MUR n、BUR n分别表示一个节点的第n个周期预测的CPU利用率、内存利用率和带宽的利用率; Among them, CUR n , MUR n , and BUR n respectively represent the CPU utilization, memory utilization, and bandwidth utilization predicted in the nth cycle of a node;
对决策矩阵M每列进行标准化处理得到决策矩阵R:Standardize each column of the decision matrix M to obtain the decision matrix R:
R = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ \vdots & \vdots & \vdots \\ R_{n1} & R_{n2} & R_{n3} \end{pmatrix}
where

R_{ij} = \frac{M_{ij}}{\sum_{i=1}^{n} M_{ij}},

R_{i1} denotes the element in the i-th row and first column of the decision matrix R, and each column of the decision matrix R satisfies normalization, i.e.

\sum_{i=1}^{n} R_{ij} = 1,

that is, the sum of the values in each column is 1, j = 1, 2, 3;
根据如下公式计算负载信息指标的熵:Calculate the entropy of the load information index according to the following formula:
E_j = -K\sum_{i=1}^{n} R_{ij}\,\ln R_{ij}
E j代表负载信息指标的熵值,常数K=1/ln(n),则0≤E j≤1,即E j最大为1,j为1时,E j表示CPU利用率的熵值;j为2时,E j表示内存利用率的熵值;j为3时,E j表示带宽的利用率的熵值; E j represents the entropy value of the load information index, and the constant K = 1/ln(n), then 0≤E j ≤1, that is, the maximum E j is 1, and when j is 1, E j represents the entropy value of the CPU utilization; When j is 2, E j represents the entropy value of memory utilization; when j is 3, E j represents the entropy value of bandwidth utilization;
定义D j为第j个负载信息指标E j的贡献度:D j=1-E jDefine D j as the contribution degree of the j-th load information index E j : D j =1-E j ;
步骤2-5,计算每种负载信息指标的客观权重值WO jStep 2-5, calculate the objective weight value WO j of each load information index:
WO_j = \frac{D_j}{\sum_{j=1}^{3} D_j}
WO 1,WO 2,WO 3分别代表CPU对于节点负载影响的客观权重值、内存对于节点负载影响的客观权重值和带宽对于节点负载影响的客观权重值,并且WO 1+WO 2+WO 3=1; WO 1 , WO 2 , WO 3 respectively represent the objective weight value of CPU's impact on node load, the objective weight value of memory on node load, and the objective weight of bandwidth on node load, and WO 1 +WO 2 +WO 3 = 1;
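A minimal Python sketch of steps 2-4 and 2-5 follows (illustrative only); the per-cycle utilisation samples in the matrix are assumptions, not measurements from the patent.

```python
# Sketch of the entropy-based objective weighting (steps 2-4 and 2-5).
import math

# n cycles x 3 indexes (CPU, memory, bandwidth utilisation); values are illustrative
M = [
    [0.55, 0.40, 0.20],
    [0.60, 0.42, 0.25],
    [0.70, 0.41, 0.35],
    [0.65, 0.43, 0.30],
]
n = len(M)
K = 1.0 / math.log(n)

# Column-normalise M to obtain R (each column sums to 1)
col_sum = [sum(row[j] for row in M) for j in range(3)]
R = [[row[j] / col_sum[j] for j in range(3)] for row in M]

# Entropy E_j of each index and its contribution degree D_j = 1 - E_j
E = [-K * sum(R[i][j] * math.log(R[i][j]) for i in range(n) if R[i][j] > 0) for j in range(3)]
D = [1 - e for e in E]

# Objective weights WO_j, which sum to 1
WO = [d / sum(D) for d in D]
print("objective weights WO:", [round(w, 3) for w in WO])
```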
步骤2-6,计算节点的最终的负载信息指标的权重w i Step 2-6, calculate the weight w i of the final load information index of the node:
w i=β×WS i+(1-β)×WO i,         (1-12) w i =β×WS i +(1-β)×WO i , (1-12)
其中β为主客观权重调整系数,w i为最终节点负载的权重,其中i=1,2,3,并且w 1+w 2+w 3=1,w 1表示最终的CPU利用率的权重,w 2表示最终的内存利用率的权重,w 3表示最终的带宽的利用率的权重; Where β is the subjective and objective weight adjustment coefficient, w i is the weight of the final node load, where i = 1, 2, 3, and w 1 + w 2 + w 3 = 1, w 1 represents the final CPU utilization weight, w 2 represents the weight of the final memory utilization, w 3 represents the weight of the final bandwidth utilization;
Step 2-7, calculate the processing capacity of the node:
CA i=w 1×(1-CAU i)+w 2×(1-MAU i)+w 3×(1-BAU i),       (1-13) CA i = w 1 ×(1-CAU i )+w 2 ×(1-MAU i )+w 3 ×(1-BAU i ), (1-13)
其中,CAU i、MAU i、BAU i分别代表预测得到的第i个节点当前周期的CPU利用率、内存利用率、带宽利用率,CA i表示第i个节点处理能力。 Among them, CAU i , MAU i , and BAU i respectively represent the predicted CPU utilization, memory utilization, and bandwidth utilization of the i-th node in the current cycle, and CA i represents the processing capacity of the i-th node.
步骤3包括:Step 3 includes:
计算每个节点要分配的数据量的占比:Calculate the proportion of the amount of data to be distributed by each node:
DP_i = \frac{CA_i}{\sum_{i=1}^{m} CA_i},         (1-14)
其中DP i代表第i个节点应分配的数据量占比,m表示节点总数。 Among them, DP i represents the proportion of the amount of data that should be allocated to the i-th node, and m represents the total number of nodes.
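The following Python sketch (illustrative only) ties steps 2-6 through 3 together: combining the subjective and objective weights with formula 1-12, computing the node processing capacity with formula 1-13, and obtaining the data proportion per node with formula 1-14. All numeric values (β, the weight vectors, and the predicted utilisations) are assumptions.

```python
# Sketch of combined weights, node processing capacity and data proportions.

beta = 0.5                                  # subjective/objective adjustment coefficient (assumed)
WS = [0.5, 0.3, 0.2]                        # subjective weights from the AHP step (assumed)
WO = [0.4, 0.2, 0.4]                        # objective weights from the entropy step (assumed)
w = [beta * ws + (1 - beta) * wo for ws, wo in zip(WS, WO)]   # final index weights w1, w2, w3

# Predicted current-cycle utilisation (CPU, memory, bandwidth) per node (assumed values)
nodes = [
    (0.70, 0.50, 0.30),
    (0.40, 0.45, 0.20),
    (0.55, 0.60, 0.50),
]

# Processing capacity CA_i = w1*(1-CAU_i) + w2*(1-MAU_i) + w3*(1-BAU_i)
CA = [sum(wk * (1 - u) for wk, u in zip(w, node)) for node in nodes]

# Proportion of data to allocate to each node: DP_i = CA_i / sum(CA)
DP = [c / sum(CA) for c in CA]
print("final weights:", [round(x, 3) for x in w])
print("data proportions:", [round(x, 3) for x in DP])
```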
所述数据迁移模块通过设置高、低负载阈值来作为触发数据迁移的条件,构造出源机和目标机的选择队列,在出现负载不均衡问题时,选择源机和目标机来进行数据迁移,源机作为待迁移数据的节点,目标机作为接受迁移数据的节点,并获得应迁移的数据量。The data migration module constructs a selection queue of source and target machines by setting high and low load thresholds as conditions for triggering data migration. When a load imbalance problem occurs, select the source and target machines for data migration, The source machine serves as the node of the data to be migrated, and the target machine serves as the node that accepts the data to be migrated, and obtains the amount of data to be migrated.
所述数据迁移模块通过设置高、低负载阈值来作为触发数据迁移的条件,构造出源机和目标机的选择队列,在出现负载不均衡问题时,选择源机和目标机来进行数据迁移,源机作为待迁移数据的节点,目标机作为接受迁移数据的节点,并获得应迁移的数据量,具体包括如下步骤:The data migration module constructs a selection queue of source and target machines by setting high and low load thresholds as conditions for triggering data migration. When a load imbalance problem occurs, select the source and target machines for data migration, The source machine acts as the node of the data to be migrated, and the target machine acts as the node that accepts the data to be migrated, and obtains the amount of data that should be migrated, including the following steps:
步骤a1,选择源机:Step a1, select the source machine:
计算每个节点的整体负载值:Calculate the overall load value of each node:
Load i=w 1×CUR i+w 2×MUR i+w 3×BUR i,     (1-15) Load i = w 1 ×CUR i +w 2 ×MUR i +w 3 ×BUR i , (1-15)
where Load_i denotes the overall load value of the i-th node. The overall load value of each node is compared with the set threshold H_th; if a node's overall load value exceeds H_th, the node is added to the high-load node queue, and the source-machine selection queue S_y = {s_1, s_2, ..., s_m} is formed in descending order of overall load value, where s_m denotes the m-th node in the queue S_y, i.e. the node with the smallest overall load value in the queue;
对S y队列中的每个节点,按整体负载值从大到小的顺序进行源机的选择; For each node in the Sy queue, select the source machine according to the overall load value in descending order;
Step a2, select the target machines: the overall load value of each node is compared with the set threshold L_th; if a node's overall load value is lower than L_th, the node is added to the low-load node queue, and the target-machine selection queue D_m = {d_1, d_2, ..., d_z} is formed in ascending order of overall load value, where d_z denotes the z-th node in the queue D_m, i.e. the node with the largest overall load value in the queue;
对D m队列中的每个节点,按整体负载值从小到大的顺序进行目标机的选择; For each node in the D m queue, select the target machine in the order of the overall load value from small to large;
步骤a3,进行数据迁移:Step a3, perform data migration:
如果高、低负载队列节点数目相同,即m=z,则分别将高、低负载队列中的节点按照顺序进行匹配并行迁移,迁移的分区数公式如下:If the number of nodes in the high-load and low-load queues is the same, that is, m=z, the nodes in the high-load and low-load queues will be matched and migrated in parallel in sequence. The formula for the number of migrated partitions is as follows:
N_q = \frac{N_y - N_m}{2},         (1-16)
其中N q代表迁移的分区数,N y代表源机中的分区数,N m代表目标机中的分区数; Where N q represents the number of partitions to be migrated, N y represents the number of partitions in the source machine, and N m represents the number of partitions in the target machine;
if the number of nodes in the high-load queue is greater than the number of low-load nodes, i.e. S_y > D_m, the low-load threshold is adjusted appropriately so that the number of nodes in the low-load node queue is equal to or slightly greater than the number of nodes in the high-load node queue, and the number of partitions to migrate is then set according to formula 1-16. To reduce unnecessary transmission between nodes while achieving load balance, the low-load threshold needs to be adjusted; for example, if there are 20 nodes whose load exceeds 0.9 but only 10 nodes whose load is below 0.2, the low-load threshold should be raised to about 0.35, so that the high-load nodes shed as much load pressure as possible while the migration can still be executed in parallel with one-to-one matching of high- and low-load nodes for data transfer;
if the number of nodes in the high-load queue is much smaller than the number of low-load nodes, i.e. S_y < D_m, the high-load threshold is lowered appropriately (for example, a high-load threshold previously set to 0.9 can be reduced to about 0.75) so that the number of nodes in the high-load node queue is equal to or slightly smaller than the number of nodes in the low-load node queue, and the number of partitions to migrate is then set according to formula 1-16;
获得源机应迁移的分区数后,即能够进行数据迁移。After obtaining the number of partitions that the source machine should migrate, the data can be migrated.
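A minimal Python sketch of the migration step follows (illustrative only): it builds the source and target queues from the high/low load thresholds and computes the number of partitions to move using formula 1-16 as reconstructed above. The thresholds, node loads and partition counts are assumptions.

```python
# Sketch of source/target queue construction and migration partition count.

H_TH, L_TH = 0.8, 0.3                       # high- and low-load thresholds (assumed)

# node id -> (overall load value Load_i, current number of partitions); illustrative values
cluster = {"n1": (0.92, 10), "n2": (0.85, 9), "n3": (0.25, 4), "n4": (0.20, 3), "n5": (0.55, 6)}

# Source queue S_y: high-load nodes in descending order of load
sources = sorted((nid for nid, (ld, _) in cluster.items() if ld > H_TH),
                 key=lambda nid: cluster[nid][0], reverse=True)
# Target queue D_m: low-load nodes in ascending order of load
targets = sorted((nid for nid, (ld, _) in cluster.items() if ld < L_TH),
                 key=lambda nid: cluster[nid][0])

# One-to-one matching when the queues have the same length; the thresholds would be
# adjusted first if the queue lengths differ, as described in the text above.
for src, dst in zip(sources, targets):
    n_y = cluster[src][1]                   # partitions in the source machine
    n_m = cluster[dst][1]                   # partitions in the target machine
    n_q = (n_y - n_m) // 2                  # partitions to migrate (formula 1-16, reconstructed)
    print(f"migrate {n_q} partition(s) from {src} to {dst}")
```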
本发明系统涉及如下核心内容:The system of the present invention involves the following core contents:
(1)负载监测模块(1) Load monitoring module
Resource monitoring can be carried out by deploying it on the cluster servers; the main monitored indexes are CPU, memory and bandwidth utilization, and the real-time cluster resource monitoring interface lays the groundwork for the collection module. Combined with the load prediction and index weight determination methods, the module can determine whether a node is a high- or low-load node, laying the groundwork for the data migration module.
(2)采集模块(2) Acquisition module
1)负载信息指标的选取1) Selection of load information indicators
节点中可以描述节点负载情况的关键资源有很多,例如CPU利用率、CPU上下文切换速率、空余硬盘大小、内存利用率、带宽使用率以及I/O资源等。本发明基于负载的分区策略主要使用CPU、内存及带宽利用率来表示节点的整体负载值。There are many key resources in the node that can describe the load of the node, such as CPU utilization, CPU context switching rate, free hard disk size, memory utilization, bandwidth utilization, and I/O resources. The load-based partition strategy of the present invention mainly uses CPU, memory, and bandwidth utilization to represent the overall load value of the node.
2)采集周期2) Collection cycle
The collection module collects the load information of all nodes at fixed intervals. If the collection period is too short, it increases the load of the central node and consumes a certain amount of bandwidth, which affects the performance of the distributed system; if the collection period is too long, outdated data are used, the information loses its real-time value, wrong partitioning decisions may be made during data partitioning, and in emergencies nodes that urgently need balancing may not be handled in time while nodes that do not need balancing are handled instead. In order to collect node load information promptly and more accurately, cluster resource monitoring can be deployed to collect the resource information, save it in a cache array, and persist the historical resource information to the database. Most current papers use a time interval between 5s and 15s, and collection can be performed according to the period set by the user.
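A minimal sketch of such a periodic collection loop is shown below (illustrative only; the function names, the cache structure and the persistence call are assumptions, not APIs from the patent).

```python
# Sketch of the collection module's periodic loop.
import time
from collections import deque

CACHE = deque(maxlen=100)                   # in-memory cache of recent samples

def sample_node_load():
    """Placeholder for reading CPU, memory and bandwidth utilisation of one node."""
    return {"cpu": 0.5, "mem": 0.4, "bw": 0.2}

def persist_to_database(sample):
    """Placeholder for writing a historical sample to the MySQL database."""
    pass

def collection_loop(period_seconds=5, cycles=3):
    for _ in range(cycles):                 # bounded here so the sketch terminates
        sample = sample_node_load()
        CACHE.append(sample)                # keep the latest samples for prediction
        persist_to_database(sample)         # persist history for later analysis
        time.sleep(period_seconds)

if __name__ == "__main__":
    collection_loop(period_seconds=1, cycles=2)
    print(list(CACHE))
```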
(3)数据预分区模块(3) Data pre-partitioning module
1)负载预测1) Load forecast
The prediction module is used to predict the load of a node at future moments, so as to decide how the data volume is distributed. Research has concluded that changes in host load exhibit self-similarity and long-term dependence; for loads with such characteristics, a prediction mechanism can determine the load reflecting the real overall trend of a node at the moment of data distribution, so that the data can be partitioned more effectively and wrong data-partitioning decisions can be prevented.
2)指标权重判定方法2) Judgment method of index weight
The present invention selects the CPU utilization CUR, the memory utilization MUR and the bandwidth utilization BUR to judge the load of a node. Since Spark-MemSql applications may be CPU-intensive, memory-intensive, transmission-intensive or mixed, the weight of each index is likely to differ between application scenarios, so the weight ratio of each index must be determined. This load model formula has one characteristic: the larger the weight given to an index, the more that index affects the total load value. For example, if the <CUR, MUR, BUR> values of two nodes in the cluster are <0.9, 0.2, 0.2> and <0.4, 0.6, 0.5>, the CPU load of the first node is clearly very high and has reached a bottleneck, while the load of the second node is relatively even; common sense says the first node should be balanced first, receiving as little new data as possible or migrating part of its data to other, less loaded nodes. If the three index weights are taken as w_1 = 0.1, w_2 = 0.5 and w_3 = 0.4, the formula gives a load of 0.27 for the first node and 0.54 for the second node, so comparing load values would select the second node for balancing; this confirms that different weight values change the comprehensive judgment of the load. A reasonable index weight determination method is therefore needed to determine the weight of each index and thus the overall load of each node; a small numeric check of this example is given below.
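The following lines reproduce the arithmetic of the two-node example above; the weights and utilisation vectors are those quoted in the text.

```python
# Numeric check of the example: two nodes with <CUR, MUR, BUR> = <0.9, 0.2, 0.2> and <0.4, 0.6, 0.5>,
# weights w = (0.1, 0.5, 0.4).
w = (0.1, 0.5, 0.4)
node_a = (0.9, 0.2, 0.2)
node_b = (0.4, 0.6, 0.5)

load = lambda node: sum(wi * ui for wi, ui in zip(w, node))
print(load(node_a))   # 0.27 -> despite its CPU bottleneck, node A looks lightly loaded
print(load(node_b))   # 0.54 -> node B would be chosen for balancing first
```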
3)节点的数据分布3) Data distribution of nodes
Since the amount of data to be analyzed is large, localized data reading and analysis is adopted in order to reduce network transmission. In the data pre-partitioning phase, MemSql's default partitioning scheme assigns the same number of partitions to every node; because the heterogeneity of the cluster nodes gives them different processing capacities, this causes data skew between nodes and an unbalanced cluster load. In order to use localized resources as much as possible, improve the parallel computing efficiency of the distributed system and reduce network transmission, the overall load of each node must be considered, i.e. the data must be partitioned effectively according to each node's capacity for processing tasks.
(4)数据迁移模块(4) Data migration module
Because cluster load imbalance can arise after certain applications finish or when the pre-partitioning is unreasonable, the following problems need to be solved: 1) under what conditions data migration is needed to achieve load balance, i.e. the trigger condition for data migration; 2) from which node data should be migrated, i.e. the selection of the source machine; 3) to which node the data should be migrated, i.e. the selection of the target machine; 4) how much data should be migrated, i.e. the number of partitions to migrate.
Beneficial effects: the present invention provides a subjective-objective weight integration method that quantifies the computing capacity of each node in the cluster and makes full use of each node's computing resources, thereby improving the overall response speed of big data analysis applications. When a load imbalance appears in the cluster, the present invention provides a dynamic load balancing strategy that can flexibly keep the resource utilization of the distributed cluster stable. For parallelizable, iterative, largely memory-resident applications such as association analysis, clustering, neural networks and other machine learning algorithms, the method has already been widely used in modules such as the cluster analysis of event comments and the analysis of related-person associations in the company's public opinion analysis system, ultimately speeding up the response of these applications.
附图说明Description of the drawings
下面结合附图和具体实施方式对本发明做更进一步的具体说明,本发明和其他方面的优点将会变得更加清楚。In the following, the present invention will be further described in detail with reference to the accompanying drawings and specific embodiments, and the advantages of the present invention and other aspects will become clearer.
图1是基于节点负载的数据动态分区机制流程图;Figure 1 is a flow chart of the dynamic data partition mechanism based on node load;
图2是集群资源监控界面图;Figure 2 is a diagram of the cluster resource monitoring interface;
图3是预测机制流程图;Figure 3 is a flow chart of the prediction mechanism;
图4是AHP层次分析法计算指标权重流程图;Figure 4 is a flowchart of calculating index weights by AHP analytic hierarchy process;
图5是熵值法计算指标权重流程图;Figure 5 is a flowchart of calculating index weights by entropy method;
图6是Spark和MemSql集成图。Figure 6 is the integration diagram of Spark and MemSql.
图7是关联分析应用的CPU利用率预测对比图。Figure 7 is a comparison diagram of the CPU utilization prediction of the correlation analysis application.
图8是Kmeans聚类分析应用的CPU利用率预测对比图。Figure 8 is a comparison diagram of the CPU utilization prediction of the Kmeans clustering analysis application.
图9是关联分析应用预分区策略性能对比图。Figure 9 is a performance comparison chart of the correlation analysis application pre-partitioning strategy.
图10是Kmeans聚类分析应用预分区策略性能对比图。Figure 10 is a comparison diagram of the performance of the Kmeans clustering analysis application pre-partitioning strategy.
图11是关联分析应用的不同预分区策略节点负载利用率对比图。Figure 11 is a comparison diagram of node load utilization of different pre-partitioning strategies applied by correlation analysis.
图12是Kmeans聚类分析应用的不同预分区策略节点负载利用率对比图。Figure 12 is a comparison diagram of node load utilization of different pre-partitioning strategies applied by Kmeans clustering analysis.
图13是关联分析迁移策略性能对比图。Figure 13 is a performance comparison diagram of the correlation analysis migration strategy.
图14是Kmean聚类分析迁移策略性能对比图。Figure 14 is a comparison chart of Kmean clustering analysis migration strategy performance.
图15是关联分析数据迁移前后节点负载利用率平均值对比图。Figure 15 is a comparison diagram of the average node load utilization before and after the correlation analysis data is migrated.
图16是Kmeans聚类分析数据迁移前后节点负载利用率平均值对比图。Figure 16 is a comparison chart of average node load utilization before and after data migration of Kmeans cluster analysis.
Detailed Description of the Embodiments
In order to achieve dynamic load balance of the system and improve the response speed of applications, a node load-based dynamic data partitioning system is proposed. As shown in Figure 1, the system comprises a load monitoring module, a collection module, a data pre-partitioning module and a data migration module. The whole Spark-MemSql integrated cluster remains in use by applications; the master node in the load monitoring module periodically reads the load information of each index from the slave nodes and dynamically displays the CPU, memory and bandwidth utilization in the monitoring interface. The collection module then saves the load information into a cache array and periodically persists it to the Mysql database, providing index load information for load prediction. When a large amount of new data is imported, the prediction module of the data pre-partitioning module predicts each index of every node, the index weight determination method yields the weight of each index, the processing capacity of each node is obtained from the predicted index information and the index weights, and the data are distributed according to each node's processing capacity to complete the data pre-partitioning. If a load imbalance appears during operation and the set load threshold is reached, the high- and low-load nodes are added to the source and target machine queues and partitions are migrated according to the migration strategy. If load imbalance occurs again after migration, the same process is used for dynamic migration of partitions.
(1)监测模块(1) Monitoring module
Resource monitoring can be carried out by deploying it on the cluster servers. As shown in Figure 2, the main monitored indexes are CPU, memory and bandwidth utilization, and the real-time cluster resource monitoring interface lays the groundwork for the collection module. Combined with the load prediction and index weight determination methods, the module can determine whether a node is a high- or low-load node, laying the groundwork for the data migration module.
(2)采集模块(2) Acquisition module
1)负载信息指标的选取1) Selection of load information indicators
节点中可以描述节点负载情况的关键资源有很多,例如CPU利用率、CPU上下文切换速率、空余硬盘大小、内存利用率、带宽使用率以及I/O资源等。本发明基于负载的分区策略主要使用CPU、内存及带宽利用率来表示节点的整体负载值。There are many key resources in the node that can describe the load of the node, such as CPU utilization, CPU context switching rate, free hard disk size, memory utilization, bandwidth utilization, and I/O resources. The load-based partition strategy of the present invention mainly uses CPU, memory, and bandwidth utilization to represent the overall load value of the node.
2)采集周期2) Collection cycle
The collection module collects the load information of all nodes at fixed intervals. If the collection period is too short, it increases the load of the central node and consumes a certain amount of bandwidth, which affects the performance of the distributed system; if the collection period is too long, outdated data are used, the information loses its real-time value, wrong partitioning decisions may be made during data partitioning, and in emergencies nodes that urgently need balancing may not be handled in time while nodes that do not need balancing are handled instead. In order to collect node load information promptly and more accurately, the API provided by the Yarn resource management component can be used to collect the resource information and save it in a cache array, and the historical resource information can be persisted to the database. Most current papers use a time interval between 5s and 15s, and collection can be performed according to the period set by the user.
(3)数据预分区模块(3) Data pre-partitioning module
1)负载预测1) Load forecast
The traditional data distribution strategy bases its partitioning decisions only on the current real-time load information of the nodes. Suppose a node's load shows an instantaneous peak or trough and then returns to normal: with a traditional data partitioning strategy this spike inevitably influences the final data distribution decision, easily leading to an unbalanced data distribution and unnecessary system overhead, so wrong data pre-partitioning decisions caused by instantaneous load peaks must be prevented. Even after the data have been allocated, unexpected situations such as removing a node after a crash, adding a node for horizontal scaling, or an extremely unbalanced load all require partition migration to balance the load, and the load prediction module is still needed to decide the amount to migrate.
1、二次指数平滑法1. Quadratic exponential smoothing method
The double exponential smoothing method applies exponential smoothing once more on top of single exponential smoothing and cannot be used for prediction on its own; combined with single exponential smoothing it yields a mathematical model that can be used to determine the predicted value at the next moment. At present, most prediction models choose double exponential smoothing. Single exponential smoothing and the average-load method are suited to time series with a level (horizontal) trend; when the actual values show a rising or falling trend, the deviation between predicted and actual values becomes large and a clear lag appears. Analysis applications on the Spark-MemSql integrated framework do produce rising or falling loads, and double exponential smoothing handles this scenario better because it exploits the regularity of the lag deviation to capture the trend of the value changes. The present invention therefore adopts a double exponential smoothing model for load prediction.
The single exponential smoothing formula is as follows:
\hat{Y}^{(1)}_j = \alpha Y_j + (1-\alpha)\,\hat{Y}^{(1)}_{j-1}
二次指数平滑法公式如下:The formula of the quadratic exponential smoothing method is as follows:
\hat{Y}^{(2)}_j = \alpha \hat{Y}^{(1)}_j + (1-\alpha)\,\hat{Y}^{(2)}_{j-1}
综合一、二次指数平滑公式,可得出第T个周期的负载预测值,公式如下:Combining the first and second exponential smoothing formulas, the load forecast value of the T-th period can be obtained, the formula is as follows:
\hat{Y}_{j+T} = a_j + b_j T, \qquad a_j = 2\hat{Y}^{(1)}_j - \hat{Y}^{(2)}_j, \qquad b_j = \frac{\alpha}{1-\alpha}\left(\hat{Y}^{(1)}_j - \hat{Y}^{(2)}_j\right)
where Y_j is the actual value of the j-th cycle; \hat{Y}^{(1)}_{j-1} and \hat{Y}^{(1)}_j are the single-smoothing (predicted) values of the (j-1)-th and j-th cycles; \hat{Y}^{(2)}_{j-1} and \hat{Y}^{(2)}_j are the double exponential smoothing values of the (j-1)-th and j-th cycles; \hat{Y}_{j+T} is the predicted value of the (j+T)-th cycle; a_j and b_j are intermediate parameters; and α is the smoothing coefficient, α ∈ [0, 1]. The predicted value is strongly affected by the smoothing coefficient α: the smaller α is, the greater the influence of historical data; the larger α is, the greater the influence of recent data. Generally speaking, when the data fluctuate little, the influence of the most recent data on the prediction result should be reduced and a smaller α should be chosen; when the data fluctuate strongly, the influence of recent data on the prediction result should be increased and a larger α should be chosen.
通常,数据波动小的情况,α应选择较小的值,如0.05-0.15;数据有波动但长期波动不大的情况,α应选择稍大的值,如0.1-0.5;数据波动大且长期也大,α应选择较大的值,如0.6-0.8;数据明显上升或下降的趋势,α应选择较大的值,如0.6-1。Generally, when the data fluctuates little, a smaller value should be selected for α, such as 0.05-0.15; when the data fluctuates but the long-term fluctuation is not large, α should be selected a slightly larger value, such as 0.1-0.5; the data fluctuates greatly and long-term Also large, a larger value should be selected for α, such as 0.6-0.8; when the data is obviously rising or falling, a larger value should be selected for α, such as 0.6-1.
2、通过采集模块将历史负载信息保存到Mysql数据库中。当分析的数据预分区到MemSql集群中时,首先需要把集群中所有节点的前n-1个周期采集的数据作为负载数据参数传给预测模块,与当前的负载组成大小为n的负载数据集,取第一次测量的实际值作为初值Y j、一次预测初值及二次预测初值。使用这n个周期数据预测未来d个周期节点负载值,然后取未来d个周期节点负载的平均值,最终确定集群中每个节点的负载信息,为数据分区策略模块提供集群节点的未来负载信息,从而为数据分区策略模块提供决策依据。同理,如果遇到突发情况引起集群负载不均衡,则将d个周期节点整体负载的平均值与阈值比较,如果平均值大于阈值,则触发数据迁移操作。在本策略中,如果某个节点的未来d个周期负载平均值高于高阈值或低于低阈值,则更新高低负载队列。平滑系数根据j个预测数据与真实数据进行标准偏差S,当S值最小时的平滑系数α对应的值为最终平滑系数标准。标准偏差S公式如下: 2. Save the historical load information to the Mysql database through the acquisition module. When the analyzed data is pre-partitioned to the MemSql cluster, the data collected in the first n-1 cycles of all nodes in the cluster must be sent to the prediction module as the load data parameter, and the current load forms a load data set of size n. , Take the actual value of the first measurement as the initial value Y j , the initial value of the first prediction and the initial value of the second prediction. Use these n period data to predict the node load value in the future d periods, and then take the average value of the node load in the future d periods, and finally determine the load information of each node in the cluster, and provide the data partition strategy module with the future load information of the cluster nodes , So as to provide decision-making basis for the data partition strategy module. In the same way, if an unexpected situation causes cluster load imbalance, the average value of the overall load of the d period nodes is compared with the threshold, and if the average value is greater than the threshold, the data migration operation is triggered. In this strategy, if the average load of a node in the future d cycles is higher than the high threshold or lower than the low threshold, the high and low load queues are updated. The smoothing coefficient is based on the standard deviation S of the j predicted data and the real data, and the corresponding value of the smoothing coefficient α when the S value is the smallest is the final smoothing coefficient standard. The formula for standard deviation S is as follows:
S = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(Y_j-\hat{Y}_j\right)^{2}}
其中,n代表取的周期数,j代表第j个周期。预测机制的流程如图3所示,通过调整平滑系数α值来计算偏方差S,取S最小时对应的平滑系数α值。n、d的值由用户设定。Among them, n represents the number of cycles taken, and j represents the j-th cycle. The flow of the prediction mechanism is shown in Figure 3. The partial variance S is calculated by adjusting the value of the smoothing coefficient α, and the value of the corresponding smoothing coefficient α is taken when S is the smallest. The values of n and d are set by the user.
2)指标权重判定方法2) Judgment method of index weight
In the application scenarios of the Spark-MemSql integrated framework environment, CPU and bandwidth fluctuate strongly while memory fluctuates little; considering only the subjective AHP weighting method would neglect the importance of some indexes, whereas considering only the objective entropy method would distort the weight assigned to memory. Therefore, the present invention calculates the overall load value of each node with an index weight determination method that combines quadratic-smoothing load prediction with the integration of subjective AHP weights and objective entropy weights, and finally allocates the corresponding amount of data according to the overall load value.
1、AHP1. AHP
The main idea of the AHP subjective weighting method: in multi-attribute decision making, the decision maker compares all evaluation indexes pairwise to obtain a judgment matrix U = (A_{ij})_{n×n}, where A_{ij} is the value obtained by comparing evaluation index A_i with A_j, taking an odd value between 1 and 9 that indicates whether the former index is equally important, moderately more important, strongly more important, very strongly more important or extremely more important than the latter index; when an even value between 1 and 9 is taken, the degree of importance lies between those represented by the two adjacent odd values, and A_{ji} = 1/A_{ij}. The flow of calculating the index weights with the AHP subjective method is shown in Figure 4.
1)对CPU利用率、内存利用率和带宽利用率两两比较,得到判断矩阵A:1) Compare the CPU utilization rate, memory utilization rate and bandwidth utilization rate in pairs to obtain the judgment matrix A:
A = \begin{pmatrix} A_1/A_1 & A_1/A_2 & A_1/A_3 \\ A_2/A_1 & A_2/A_2 & A_2/A_3 \\ A_3/A_1 & A_3/A_2 & A_3/A_3 \end{pmatrix}
where A_1, A_2 and A_3 respectively denote the weight of the influence of a node's CPU utilization on the overall node load, the weight of the influence of its memory utilization on the overall node load, and the weight of the influence of its bandwidth utilization on the overall node load. Each column of the judgment matrix A is normalized to obtain the column eigenvector, each row is then normalized to obtain the row eigenvector, the weight ratio of each index is finally obtained, and a consistency check is performed on the judgment matrix A; the resulting subjective weights of a node's CPU, memory and bandwidth are WS_1, WS_2 and WS_3, with WS_1 + WS_2 + WS_3 = 1;
2)计算矩阵的特征向量和指标权重2) Calculate the eigenvectors and index weights of the matrix
①对矩阵各列求和,列和的向量为:SUM j① Sum the columns of the matrix, the vector of the column sum is: SUM j .
②对每一列进行归一化处理,公式如下:② Normalize each column, the formula is as follows:
B_{ij} = \frac{A_{ij}}{\sum_{i=1}^{n} A_{ij}}
∑A ij的值为各列的和SUM j,得到新矩阵B,B矩阵中每一列值的和都为1。 The value of ΣA ij is the sum of each column SUM j , and a new matrix B is obtained. The sum of the values of each column in the B matrix is 1.
③对每一行求和,即得出特征向量SUM i③ Sum up each row to get the feature vector SUM i .
④计算指标权重,对特征向量进行归一化处理,公式如下:④ Calculate the index weight and normalize the feature vector, the formula is as follows:
W_i = \frac{SUM_i}{\sum_{i=1}^{n} SUM_i}
The weights of the three indexes, W_1, W_2 and W_3, are thus obtained.
3)矩阵一致性检验3) Matrix consistency test
为了检验得出指标权重是否正确,需要对指标进行比较,例如:如果A>B,B>C,那么必须得出A>C,反之,则一致性不成立。所以需要对矩阵的一致性进行检验,确保没有出现以上的错误。In order to test whether the index weight is correct, it is necessary to compare the indexes. For example, if A>B, B>C, then A>C must be obtained, otherwise, the consistency is not established. Therefore, it is necessary to check the consistency of the matrix to ensure that the above errors do not occur.
①计算矩阵的最大特征根,公式如下:①Calculate the largest eigenvalue of the matrix, the formula is as follows:
Figure PCTCN2020090554-appb-000038
Figure PCTCN2020090554-appb-000038
where λ_max is the largest eigenvalue, AW denotes the column vector obtained by multiplying the matrix A by the weight vector W, n is the order of the matrix, and W is the weight vector.
② Calculate the consistency index (C.I.) of the judgment matrix with the following formula:
C.I. = \frac{\lambda_{max} - n}{n - 1}
其中,C.I.代表一致性指标,n表示矩阵的阶数。Among them, C.I. represents the consistency index, and n represents the order of the matrix.
③计算随机一致性比率,计算公式如下:③Calculate the random consistency ratio, the calculation formula is as follows:
C.R. = \frac{C.I.}{R.I.}
其中,R.I.代表平均随机一致性指标,是一个常量,根据阶数可以在量表里查询。4阶R.I.=0.89,如果C.R.<0.1,说明对比矩阵保持一致性。如果C.R.>0.1,则表示对比矩阵不具有一致性,需要进行调整。Among them, R.I. represents the average random consistency index, which is a constant, which can be queried in the scale according to the order. The fourth-order R.I.=0.89, if C.R.<0.1, it means that the contrast matrix remains consistent. If C.R.>0.1, it means that the contrast matrix is not consistent and needs to be adjusted.
2、熵值法2. Entropy method
主要思想:熵值法是一种通过判断某个指标的离散度来反映该指标对综合评价的影响程度的数学方法,能够通过指标值的变异度客观地确定权重。指标的权重与变异度呈正相关关系,即指标值的变异程度越大,其权重越大;反之,指标值的变异程度越小,其权重越小。熵值法计算指标权重流程如图5所示。Main idea: Entropy method is a mathematical method that reflects the degree of influence of an index on comprehensive evaluation by judging the dispersion of an index, and can objectively determine the weight through the degree of variation of the index value. The weight of the index is positively correlated with the degree of variability, that is, the greater the degree of variation of the index value, the greater its weight; conversely, the smaller the degree of variation of the index value, the smaller its weight. The process of calculating the index weight by the entropy method is shown in Figure 5.
具体步骤如下:Specific steps are as follows:
(1)构建负载信息决策矩阵M:(1) Construct load information decision matrix M:
M = \begin{pmatrix} CUR_1 & MUR_1 & BUR_1 \\ CUR_2 & MUR_2 & BUR_2 \\ \vdots & \vdots & \vdots \\ CUR_n & MUR_n & BUR_n \end{pmatrix}
其中,n代表周期数,CUR、MUR和BUR分别代表CPU、内存和带宽的利用率。Among them, n represents the number of cycles, and CUR, MUR, and BUR represent the utilization of CPU, memory, and bandwidth, respectively.
(2)对决策矩阵M每列进行标准化处理得到决策R:(2) Standardize each column of the decision matrix M to obtain the decision R:
R = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ \vdots & \vdots & \vdots \\ R_{n1} & R_{n2} & R_{n3} \end{pmatrix}
where

R_{ij} = \frac{M_{ij}}{\sum_{i=1}^{n} M_{ij}},

and each column of the matrix R satisfies normalization, i.e.

\sum_{i=1}^{n} R_{ij} = 1,

j = 1, 2, 3, that is, the sum of the values in each column is 1.
(3)利用熵公式计算指标的不确定度:(3) Use the entropy formula to calculate the uncertainty of the index:
用E表示任一种负载信息指标的熵,公式如下:Use E to represent the entropy of any load information index, and the formula is as follows:
E_j = -K\sum_{i=1}^{n} R_{ij}\,\ln R_{ij}
E j代表指标的熵值,常数K=1/ln(n),这样能保证0≤E≤1,即E最大为1。 E j represents the entropy value of the index, and the constant K=1/ln(n), so that 0≤E≤1 can be guaranteed, that is, the maximum E is 1.
It can be seen from the formula that when the contributions of the values under a certain attribute tend to be equal, E tends to 1; in particular, when they are all equal, the attribute plays no role in the decision, i.e. its weight is 0. The weight coefficient is therefore driven by how much the values within an attribute column differ. Accordingly, D_j can be defined as the contribution degree of an index, D_j = 1 - E_j.
(4)计算每种指标的客观权重值,公式如下:(4) Calculate the objective weight value of each indicator, the formula is as follows:
WO_j = \frac{D_j}{\sum_{j=1}^{3} D_j}
WO 1,WO 2,WO 3分别代表CPU对于节点负载影响的客观权重值、内存对于节点负载影响的客观权重值和带宽对于节点负载影响的客观权重值,并且WO 1+WO 2+WO 3=1。计算每种指标客观权重值,算法输入每种指标不同周期负载值矩阵,通过熵值法计算得到每种指标的客观权重值。 WO 1 , WO 2 , WO 3 respectively represent the objective weight value of CPU's impact on node load, the objective weight value of memory on node load, and the objective weight of bandwidth on node load, and WO 1 +WO 2 +WO 3 = 1. Calculate the objective weight value of each indicator, the algorithm inputs the matrix of load values of each indicator in different periods, and calculates the objective weight value of each indicator through the entropy method.
3、主客观AHP和熵值法权重集成法3. The weight integration method of subjective and objective AHP and entropy method
In real applications, the drawbacks of purely subjective or purely objective weight design may both appear: an index may carry a large share in the objective application while the decision maker is unaware of it; or the utilization of an index may remain stable at a high level, either because it is constantly in use or because it sits in a stable unused state, in which case the objective method easily assigns it a small weight that deviates from the subjective reality. The present invention therefore designs a subjective-objective integration method to solve such problems and balance the weight deviation between the two. The integrated weight formula is as follows:
w i=β×WS i+(1-β)×WO i,       (1-12) w i =β×WS i +(1-β)×WO i , (1-12)
其中β为主客观权重调整系数,w i为最终节点负载的权重,其中i=1,2,3,并且w 1+w 2+w 3=1。 Where β is the subjective and objective weight adjustment coefficient, w i is the weight of the final node load, where i=1, 2, 3, and w 1 + w 2 + w 3 =1.
节点数据分布:首先,由前面模块得到了CPU、内存、带宽三种指标在负载中所占的主客观集成权重大小后,分别为w 1,w 2,w 3Node data distribution: First, the subjective and objective integration weights of the three indicators of CPU, memory, and bandwidth in the load are obtained from the previous module, which are w 1 , w 2 , and w 3 respectively .
然后,通过每种指标的权重来获得每个节点的处理能力,公式如下:Then, the processing capacity of each node is obtained by the weight of each indicator, the formula is as follows:
CA i=w 1×(1-CAU i)+w 2×(1-MAU i)+w 3×(1-BAU i),   (1-13) CA i = w 1 ×(1-CAU i )+w 2 ×(1-MAU i )+w 3 ×(1-BAU i ), (1-13)
其中,CAU i、MAU i、BAU i分别代表预测后的CPU、内存、带宽利用率,i代表第i节点。 Among them, CAU i , MAU i , and BAU i represent the predicted utilization of CPU, memory, and bandwidth respectively, and i represents the i-th node.
最后,得出每个节点要分配的数据量的占比,公式如下:Finally, get the proportion of the amount of data to be allocated by each node, the formula is as follows:
DP_i = \frac{CA_i}{\sum_{i=1}^{m} CA_i},         (1-14)
其中DP i代表第i个节点应分配的数据量占比,m表示节点总数。 Among them, DP i represents the proportion of the amount of data that should be allocated to the i-th node, and m represents the total number of nodes.
通过以上步骤后可知给集群中每个节点分配的数据量,即相应的分区数。After the above steps, we can know the amount of data allocated to each node in the cluster, that is, the corresponding number of partitions.
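As a brief illustration (not part of the patent), the following lines show how the data proportions DP_i could be converted into concrete partition counts for a 32-partition table, as used in the experiment described later; the DP values themselves are assumptions.

```python
# Illustrative conversion of data proportions DP_i into partition counts for 32 total partitions.
DP = [0.40, 0.35, 0.25]        # assumed proportions for three nodes
TOTAL_PARTITIONS = 32

partitions = [round(dp * TOTAL_PARTITIONS) for dp in DP]
# Fix any rounding drift so the counts still add up to the total
partitions[-1] += TOTAL_PARTITIONS - sum(partitions)
print(partitions)              # e.g. [13, 11, 8]
```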
(4)数据迁移模块(4) Data migration module
通过设置高低负载阈值来作为触发数据迁移的条件,构造出源机和目标机的选择队列。在数据预分区之后出现负载不均衡问题或者增删节点的情况,需要选择源机和目标机来进行数据迁移,源机作为待迁移数据的节点,目标机作为接受迁移数据的节点,并获得应迁移的分区数。By setting high and low load thresholds as a condition for triggering data migration, a selection queue of source and target machines is constructed. After the data is pre-partitioned, load imbalance or addition or deletion of nodes occurs, you need to select the source and target machines for data migration. The source machine is the node of the data to be migrated, and the target machine is the node that accepts the data to be migrated. The number of partitions.
1)源机选择1) Source machine selection
首先,从负载缓存数组中读取CPU利用率、内存利用率和带宽利用率负载信息进行预测,预测T个周期后的每种指标平均负载值。First, read the CPU utilization, memory utilization, and bandwidth utilization load information from the load cache array to predict, and predict the average load value of each indicator after T cycles.
然后,将每种指标的负载利用率预测值与主客观权重集成方法得到每种指标的负载权重值结合,进而得到每个节点的整体负载值Load i。负载值公式如下: Then, the load utilization prediction value of each indicator is combined with the subjective and objective weight integration method to obtain the load weight value of each indicator, and then the overall load value Load i of each node is obtained. The load value formula is as follows:
Load i=w 1×CUR i+w 2×MUR i+w 3×BUR i,    (1-15) Load i = w 1 ×CUR i +w 2 ×MUR i +w 3 ×BUR i , (1-15)
其中,CUR i、MUR i、BUR i和w 1,w 2,w 3分别为预测后的CPU利用率、内存利用率、带宽的利用率和权重值。 Among them, CUR i , MUR i , BUR i and w 1 , w 2 , w 3 are the predicted CPU utilization, memory utilization, bandwidth utilization and weight values, respectively.
接着,将每个节点的负载值Load i与设置的阈值进行比较,如果某个节点的负载值超过H th阈值,则将该节点加入到高负载节点队列中。 Then, the load value Load i of each node is compared with the set threshold, and if the load value of a certain node exceeds the H th threshold, the node is added to the high-load node queue.
然后,按照整体负载值由大到小构成源机选择队列S y={s 1,s 2,……,s m}。 Then, the source machine selection queue Sy = {s 1 , s 2 ,..., s m } is formed according to the overall load value from large to small.
最后,从S y队列中选取源机。对S y队列中的每个节点的负载值按降序进行排列,按整体负载值从大到小的顺序进行源机的选择。 Finally, select the source machine from the Sy queue. The load value of each node in the Sy queue is sorted in descending order, and the source machine is selected in descending order of the overall load value.
2)目标机选择2) Target machine selection
首先,从负载缓存数组中读取CPU利用率、内存利用率和带宽利用率负载信息进行预测,分别预测T个周期后每种指标的平均负载值。First, read the CPU utilization, memory utilization, and bandwidth utilization load information from the load cache array to predict, and respectively predict the average load value of each indicator after T cycles.
然后,将每种指标的负载利用率预测值与主客观权重集成方法得到每种指标的负载权重值结合,代入公式1-15计算,进而得到每个节点的整体负载值Load iThen, combine the load utilization prediction value of each indicator with the subjective and objective weight integration method to obtain the load weight value of each indicator, and substitute it into formula 1-15 to calculate, and then obtain the overall load value Load i of each node.
接着,将每个节点的负载值Load i与设置的阈值进行比较,如果某个节点的负载值低于L th阈值,则将该节点加入到低负载节点队列中。 Then, the load value Load i of each node is compared with the set threshold, and if the load value of a certain node is lower than the L th threshold, the node is added to the low-load node queue.
然后,按照Load i值由小到大构成目标机选择队列D m={d 1,d 2,……,d z}。 Then, the target machine selection queue D m ={d 1 , d 2 ,..., d z } is formed according to the value of Load i from small to large.
最后,从D m队列中选取目标机。对D m队列中的Load值按升序进行排列,按Load i从小到大的顺序进行目标机的选择。 Finally, select the target machine from the D m queue. Arrange the Load values in the D m queue in ascending order, and select the target machine in the descending order of Load i.
3)迁移的分区数3) Number of migrated partitions
1、如果高低负载队列节点数目相同,即S y=D m。则分别将高低负载队列中的节点按照顺序进行匹配并行迁移,迁移的分区数公式如下: 1. If the number of high and low load queue nodes is the same, that is, Sy = D m . Then the nodes in the high and low load queues are matched and migrated in sequence in parallel, and the formula for the number of migrated partitions is as follows:
N_q = \frac{N_y - N_m}{2},         (1-16)
其中N q代表迁移的分区数,N y代表源机中的分区数,N m代表目标机中的分区数。 Where N q represents the number of partitions to be migrated, N y represents the number of partitions in the source machine, and N m represents the number of partitions in the target machine.
2、如果高负载队列节点数目大于低负载节点数目,即S y>D m。则适当调整低负载阈值,使低负载节点队列中的节点数目等于或近大于高负载节点队列中的节点数目,接着按照公式1-16设定迁移的分区数。 2. If the number of high-load queue nodes is greater than the number of low-load nodes, that is, Sy > D m . Then adjust the low-load threshold appropriately so that the number of nodes in the low-load node queue is equal to or nearly greater than the number of nodes in the high-load node queue, and then the number of migrated partitions is set according to Formula 1-16.
3、如果高负载队列节点数目远小于低负载节点数目,即S y<D m。则适当调整高负载阈值,使高负载节点队列中的节点数目等于或近小于低负载节点队列中的节点数目,接着按照公式1-16设定迁移的分区数。 3. If the number of high-load queue nodes is much smaller than the number of low-load nodes, that is, Sy <D m . Then adjust the high load threshold appropriately so that the number of nodes in the queue of high load nodes is equal to or nearly less than the number of nodes in the queue of low load nodes, and then the number of migration partitions is set according to formula 1-16.
4、得到匹配的源机和目标机,并且知道了每组中源机应迁移的分区数,就可以并行进行迁移,减少迁移开销。4. After obtaining the matched source and target machines, and knowing the number of partitions that the source machine should migrate in each group, the migration can be performed in parallel to reduce the migration overhead.
通过以上步骤,系统可实现负载均衡。对于增删节点的突发情况,同样可以采用此种迁移策略。Through the above steps, the system can achieve load balancing. For unexpected situations where nodes are added or deleted, this migration strategy can also be adopted.
分布式内存数据库MemSql采用主从结构,使用Hash作为存储方式,以数据分区Partition作为最小的存储单元块。Spark同样采用主从结构,Master节点(主节点)管理整个集群的资源,Worker节点(从节点)管理各计算节点的资源,定期向Master节点汇报节点资源情况,并启动Executor进行计算。The distributed memory database MemSql adopts a master-slave structure, uses Hash as a storage method, and uses a data partition as the smallest storage unit block. Spark also uses a master-slave structure. The Master node (master node) manages the resources of the entire cluster, and the Worker node (slave node) manages the resources of each computing node, regularly reports the node resource status to the Master node, and starts the Executor to perform calculations.
目前,Spark与MemSql有两种结合方式的应用场景:一种为Spark与MemSql是两个相对独立框架,另一种为Spark与MemSql集成框架。Currently, Spark and MemSql have two combined application scenarios: one is Spark and MemSql, which are two relatively independent frameworks, and the other is the integration framework of Spark and MemSql.
For the application scenario under the Spark-MemSql integration framework, as shown in Figure 6, localized data reading and analysis is adopted. The two systems are integrated through the MemSql Spark Connector component, which is started in the background as a daemon to connect the Master in Spark with the master aggregator in MemSql; the Spark Worker nodes can then obtain, via the Master node, the metadata of the MemSql master aggregator, including on which nodes the data reside and which partitions each node holds. This ensures that, during actual data analysis, the Spark Worker nodes use the MemSqlRDD interface to read, write, compute and analyze data locally and in parallel from the MemSql Leaf storage nodes. The smallest storage granularity in MemSql is the Partition; at present each node is assigned the same number of partitions by default, which causes data skew between nodes because the heterogeneity of the cluster nodes gives them different processing capacities. Since Spark in this framework analyzes data locally, i.e. data are analyzed and processed on the node where they reside, the number of Partitions in MemSql directly determines the number of RDD tasks in Spark, so the amount of work is positively correlated with the number of partitions. Using the default partitioning therefore leads to severe load imbalance: if a heavily loaded data node holds many partitions to be processed and analyzed, the execution time of the whole job grows, because a Spark job is not finished until all of its tasks are completed. In real applications data skew is widespread, and the resulting load imbalance of the processing nodes is an unavoidable problem for Spark-MemSql framework applications.
因此,在面向并行计算框架Spark的应用场景中,需要提出MemSql分区策略来改善负载均衡性,提高应用的响应速度。Therefore, in the application scenario of the parallel computing framework Spark, it is necessary to propose a MemSql partition strategy to improve load balance and increase the response speed of the application.
Embodiment:
For the Spark-MemSql integration framework, a Spark-MemSql integrated cluster environment was deployed on a local area network. The experiment used 5 nodes, with the total number of partitions set to 32. A data set from a manufacturing enterprise was used to verify the effectiveness of the dynamic data partitioning strategy that combines load prediction with the AHP and entropy integrated weighting method.
This embodiment uses the table FIS_PRODUCT from a manufacturing enterprise as the test data set, as shown in Table 1; it contains roughly 50 million rows. Each record includes a time ID, plant category, product category, product length, product stretch length, product weight, and so on. The LENGTH and WEIGHT columns serve as the data set for the association-analysis application test, while the LENGTH, DRAWLENGTH, and WEIGHT columns serve as the data set for the Kmeans application test; different applications use different data sets for testing.
Table 1 (the structure and sample contents of the FIS_PRODUCT table are given as an image in the original publication)
(1) Testing and verifying the prediction module. Related applications are run to imitate the actual application environment under the Spark-MemSql integration framework, and the load utilization is predicted in that environment with a sampling period of 5 s. The deviation between predicted and actual values is then computed to adjust the smoothing coefficient, laying the groundwork for the comparative test of partitioning strategies and verifying the effectiveness of the prediction algorithm in this scenario. The test procedure for the prediction module is: read the collected historical load information, predict the load with the double exponential smoothing algorithm, compute the deviation S between the predicted and true values, and reduce S by adjusting the smoothing coefficient α. The same method is used to tune the smoothing coefficient for the different application scenarios.
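For illustration, a minimal Python sketch of this test loop is given below, assuming the historical samples are already available in memory; the function names, the candidate α grid, and the sample values are hypothetical and not taken from the patent.

```python
import math

def double_exponential_forecast(history, alpha, horizon=1):
    """Brown's double exponential smoothing: fit the series and forecast ahead."""
    s1 = s2 = history[0]                     # initialise both smoothed series with the first sample
    fits = []
    for y in history:
        a, b = 2 * s1 - s2, (alpha / (1 - alpha)) * (s1 - s2)
        fits.append(a + b)                   # one-step-ahead prediction made before seeing y
        s1 = alpha * y + (1 - alpha) * s1    # single exponential smoothing
        s2 = alpha * s1 + (1 - alpha) * s2   # double exponential smoothing
    a, b = 2 * s1 - s2, (alpha / (1 - alpha)) * (s1 - s2)
    return fits, a + b * horizon             # forecast 'horizon' periods ahead

def best_alpha(history, candidates=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Pick the smoothing coefficient that minimises the deviation S between fits and truth."""
    def deviation(alpha):
        fits, _ = double_exponential_forecast(history, alpha)
        return math.sqrt(sum((f - y) ** 2 for f, y in zip(fits, history)) / len(history))
    return min(candidates, key=deviation)

cpu_history = [0.42, 0.45, 0.51, 0.48, 0.55, 0.60, 0.58]   # utilisation sampled every 5 s
alpha = best_alpha(cpu_history)
_, next_cpu = double_exponential_forecast(cpu_history, alpha)
print(alpha, next_cpu)
```

In this sketch the α grid plays the role of the manual adjustment described above: the coefficient with the smallest deviation S is kept for the given application scenario.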
(2) Performance comparison of different pre-partitioning strategies. The applications in this experiment are: an association analysis on the LENGTH and WEIGHT columns, which represent the length of a product and its corresponding weight, to analyze the correlation between product length and weight; and a Kmeans cluster analysis on the LENGTH, DRAWLENGTH, and WEIGHT columns, which represent product length, product stretch length, and product weight, to classify products by clustering. Four pre-partitioning strategies are compared: the default strategy, load prediction + AHP weighting, load prediction + entropy weighting, and load prediction + the AHP and entropy integrated weighting. The execution time of the same application is measured under each strategy to verify the effectiveness of the scheme.
(3) If cluster load imbalance appears in the Spark-MemSql framework, the migration strategy is used to move data partition blocks between source and target machines. The same application is run before and after migration and the performance is compared to verify the effectiveness of the scheme.
Implementation step 1: load prediction algorithm. Different applications are tested separately; the load of a given node is collected and predicted to verify the effectiveness of the prediction algorithm under different application scenarios, and the smoothing coefficients α of the different load indicators are obtained for each scenario. As shown in Figures 7 and 8, CPU utilization fluctuates in both applications; the double exponential smoothing method predicts CPU utilization fairly accurately and avoids the influence of instantaneous peaks. The same method is used to predict and compare the other indicators, and the smoothing coefficients α of the different indicators in the different application scenarios are finally obtained, as shown in Tables 2 and 3.
Table 2
Indicator                 CPU     Memory   Bandwidth
Smoothing coefficient α   0.7     0.35     0.75
Table 3
Indicator                 CPU     Memory   Bandwidth
Smoothing coefficient α   0.75    0.40     0.65
Implementation step 2: pre-partitioning strategy. Partitions are allocated under the different pre-partitioning strategies in two groups of experiments, each group running the same application: the first group runs the association-analysis application, the second group runs the Kmeans clustering application. The execution times of the applications under the different partitioning strategies are compared to verify the effectiveness of the scheme.
(1) Obtaining the weight of each indicator with AHP
First, the indicator decision matrix A is entered (the matrix is given as an image in the original publication).
The evaluation compares columns and rows pairwise, where A 1, A 2 and A 3 represent CPU, memory and bandwidth respectively. The random consistency ratio is then computed as C.R. = C.I./R.I. = 0.00103 < 0.1, which shows that the comparison matrix is consistent and the decision matrix is reasonably designed. Next, AHP is used to obtain the weight value of each indicator; then the utilization of each indicator is collected periodically while the application runs, and the entropy method is used to obtain the weight value of each indicator. Finally, the weight coefficient β is set to 0.8 after adjustment over several experiments, yielding the integrated weight values. The results for the two application scenarios are shown in Tables 4 and 5 respectively.
Table 4
Indicator                          CPU        Memory     Bandwidth
AHP weight                         61.523%    31.872%    6.604%
Entropy-method weight              38.231%    19.076%    42.693%
AHP + entropy integrated weight    57.762%    29.62%     13.518%
Table 5
Indicator                          CPU        Memory     Bandwidth
AHP weight                         61.523%    31.872%    6.604%
Entropy-method weight              43.231%    24.076%    32.693%
AHP + entropy integrated weight    58.762%    30.52%     11.518%
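To make the derivation of the weights in Tables 4 and 5 concrete, the following Python sketch (NumPy only) computes AHP weights with a consistency check, entropy weights from periodic utilization samples, and the integrated weights with β = 0.8. The judgment-matrix entries and the sample utilizations below are illustrative assumptions, not the values used in the experiments.

```python
import numpy as np

def ahp_weights(A):
    """Column-normalise the judgment matrix, average the rows, and check consistency."""
    B = A / A.sum(axis=0)                    # normalise each column
    w = B.mean(axis=1)                       # row averages -> weight vector
    lam_max = float(np.mean((A @ w) / w))    # estimate of the largest eigenvalue
    n = A.shape[0]
    ci = (lam_max - n) / (n - 1)             # consistency index
    cr = ci / 0.89                           # R.I. = 0.89 for a 3x3 matrix
    return w, cr

def entropy_weights(samples):
    """samples: rows = periods, columns = indicators (CPU, memory, bandwidth utilisation)."""
    R = samples / samples.sum(axis=0)        # per-column normalisation
    k = 1.0 / np.log(len(samples))
    E = -k * np.sum(R * np.log(R + 1e-12), axis=0)   # entropy of each indicator
    D = 1.0 - E                              # contribution of each indicator
    return D / D.sum()

# Illustrative judgment matrix (CPU vs memory vs bandwidth) and utilisation samples.
A = np.array([[1.0, 2.0, 9.0],
              [0.5, 1.0, 5.0],
              [1/9, 0.2, 1.0]])
samples = np.array([[0.55, 0.60, 0.05],
                    [0.62, 0.61, 0.20],
                    [0.48, 0.59, 0.02],
                    [0.70, 0.63, 0.15]])

ws, cr = ahp_weights(A)          # subjective weights and consistency ratio
wo = entropy_weights(samples)    # objective weights
beta = 0.8
w = beta * ws + (1 - beta) * wo  # integrated weights used for node capacity
print(cr, w)
```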
(2) Using the load value predicted for each indicator in the specific application together with the different weighting methods, the processing capacity of each node under each partitioning strategy is obtained with formula 1-13; the proportion of partitions for each node is then obtained with formula 1-14, giving the number of partitions assigned to each node, as shown in Table 6:
Table 6 (the per-node partition counts under the different partitioning strategies are given as an image in the original publication)
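As a sketch of how formulas 1-13 and 1-14 might be applied to produce a table like Table 6, the Python snippet below turns predicted utilizations and integrated weights into per-node partition counts for a 32-partition table; the node utilizations, weights, and variable names are illustrative assumptions.

```python
def node_capacity(cpu, mem, bw, w):
    """Formula 1-13: capacity = weighted sum of the idle fractions of CPU, memory, bandwidth."""
    return w[0] * (1 - cpu) + w[1] * (1 - mem) + w[2] * (1 - bw)

def partition_plan(predicted, w, total_partitions=32):
    """Formula 1-14: each node gets a share of partitions proportional to its capacity."""
    capacities = [node_capacity(c, m, b, w) for (c, m, b) in predicted]
    total = sum(capacities)
    return [round(total_partitions * ca / total) for ca in capacities]

# Predicted (CPU, memory, bandwidth) utilisation for 5 nodes and the integrated weights.
predicted = [(0.55, 0.60, 0.10), (0.30, 0.40, 0.05),
             (0.70, 0.65, 0.20), (0.25, 0.35, 0.05), (0.45, 0.50, 0.10)]
w = (0.578, 0.296, 0.135)
print(partition_plan(predicted, w))   # per-node counts; rounding may leave the sum one off
```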
As shown in Figures 9 and 10, the association-analysis and Kmeans clustering applications are executed respectively. Overall, the default partitioning strategy performs worst, while the strategy designed in this work, prediction combined with the AHP and entropy weight integration method, performs best, and its advantage grows as the data volume increases. The AHP weighting method is a subjective method; it does not adjust the weight ratio to the actual application scenario and therefore lacks objectivity. The entropy weighting method is derived from the variability of the indicator values: memory utilization changes slowly but the memory is in constant heavy use, because all data computation in the Spark-MemSql framework takes place in memory, so memory usage stays relatively stable, whereas bandwidth utilization varies greatly but remains very low. Using only the objective method would therefore produce the erroneous result of a small memory weight and a large bandwidth weight. Integrating subjective and objective weights thus yields better results. Executing different applications achieves the same effect, which shows that the pre-partitioning strategy studied here generalizes to applications processing relatively independent tasks.
As shown in Figures 11 and 12, the same application is executed under the different pre-partitioning strategies and the overall average load utilization of each node over the whole run is calculated. Overall, the default partitioning strategy shows severe load imbalance, whereas the pre-partitioning strategies combining prediction + AHP, prediction + entropy, and prediction + AHP and entropy weight integration all solve the cluster load problem well and achieve balanced cluster load.
Implementation step 3: migration strategy. When load imbalance is encountered in the Spark-MemSql framework, the data migration strategy is applied and the same application is run again. The load status of the different nodes is recorded periodically through the monitoring interface, the execution times of the application before and after migration are compared, and the time overhead of the migration is taken into account to verify the effectiveness of the scheme.
The migration strategy is used to construct the high- and low-load queues and to obtain the number of partition blocks each node should receive or send. After the migration is performed, the number of partitions on each node is as shown in Table 7.
Table 7 (the per-node partition counts after migration are given as an image in the original publication)
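For illustration, a minimal Python sketch of the queue construction behind Table 7 follows. The thresholds, load values, and partition counts are made up, and since the migration-count formula (1-16) is only available as an image in the original publication, an equal-split heuristic is assumed in its place.

```python
def build_queues(loads, high_th=0.8, low_th=0.4):
    """Source queue: overloaded nodes, highest load first; target queue: underloaded nodes, lowest load first."""
    sources = sorted((n for n, l in loads.items() if l > high_th), key=loads.get, reverse=True)
    targets = sorted((n for n, l in loads.items() if l < low_th), key=loads.get)
    return sources, targets

def plan_migrations(sources, targets, partitions):
    """Pair overloaded and underloaded nodes and move partitions to even out their counts."""
    plan = []
    for src, dst in zip(sources, targets):
        n_move = max((partitions[src] - partitions[dst]) // 2, 0)   # assumed equal-split heuristic
        plan.append((src, dst, n_move))
    return plan

loads = {"node1": 0.91, "node2": 0.35, "node3": 0.55, "node4": 0.88, "node5": 0.30}
partitions = {"node1": 10, "node2": 4, "node3": 6, "node4": 9, "node5": 3}
sources, targets = build_queues(loads)
print(plan_migrations(sources, targets, partitions))
# [('node1', 'node5', 3), ('node4', 'node2', 2)]
```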
Figures 13 and 14 demonstrate the effectiveness of the migration strategy, which improves the load balance of the cluster and, to a certain extent, the response speed of the application. In the tested applications, when the data volume is small, i.e., below 30 million records for the association analysis and below 20 million records for the Kmeans analysis, the load does not reach the configured threshold and no migration is triggered. When the data volume is relatively large, i.e., the association analysis reaches 30 million records or the Kmeans analysis reaches 20 million records, the load reaches the threshold and migration is triggered; although this improves the cluster's load balance, the migration itself costs time, so the total time becomes longer. When the data volume increases further, the load imbalance intensifies, the relative migration overhead becomes smaller, and the response speed of the application improves.
As shown in Figures 15 and 16, migration tests are performed on the different applications, and the overall average load utilization of each node over the whole run before and after migration is compared; it can be seen that migration improves the load balance of the cluster.
In the data pre-partitioning stage, the partitioning strategy that combines load prediction with the AHP and entropy integrated index weights gives the best results; it resolves the load balance of the cluster and further improves the response speed of the application. When the data has already been distributed but load imbalance appears, migration restores the load balance of the cluster and improves the response speed of the application.
The present invention provides a node-load-based dynamic data partitioning system. There are many specific methods and ways to implement this technical solution, and the above is only a preferred embodiment of the present invention. It should be pointed out that a person of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention. All components not specified in this embodiment can be implemented with existing technology.

Claims (9)

  1. A node-load-based dynamic data partitioning system, characterized in that it comprises a load monitoring module, a collection module, a data pre-partitioning module and a data migration module;
    the load monitoring module is configured to select load information indicators and monitor, in real time, the load information indicator values on each node of the distributed cluster;
    the collection module is configured to periodically collect the load information indicator values on each node of the distributed cluster;
    the data pre-partitioning module is configured to predict the load information indicator values on each node of the distributed cluster, obtain the processing capacity of each node according to the indicator weighting method, and finally distribute different amounts of data according to the processing capacity of each node to complete the data pre-partitioning;
    the data migration module is configured to trigger data migration between nodes to improve load balance when a load imbalance problem occurs in the distributed cluster.
  2. The system according to claim 1, characterized in that the load monitoring module selects CPU utilization, memory utilization and bandwidth utilization as the load information indicator values, and monitors the load information indicator values on each node of the distributed cluster in real time by deploying the MemSql resource monitoring service.
  3. The system according to claim 2, characterized in that the collection module periodically obtains the load information indicator values on each node of the distributed cluster through the API provided by the distributed Yarn resource management component, and saves them in a database.
  4. The system according to claim 3, characterized in that the data pre-partitioning module is configured to predict the load information indicator values on each node of the distributed cluster, obtain the processing capacity of each node according to the AHP and entropy subjective-objective indicator weight integration method, and finally distribute different amounts of data according to the processing capacity of each node to complete the data pre-partitioning, specifically comprising the following steps:
    Step 1, predicting the load information indicator values with the double exponential smoothing method:
    the single exponential smoothing formula is:
    S_j^(1) = α·Y_j + (1-α)·S_{j-1}^(1),    (1-1)
    the double exponential smoothing formula is:
    S_j^(2) = α·S_j^(1) + (1-α)·S_{j-1}^(2),    (1-2)
    combining the single and double exponential smoothing formulas, the load prediction value for the T-th period ahead is obtained:
    Ŷ_{j+T} = a_j + b_j·T,  with  a_j = 2·S_j^(1) - S_j^(2)  and  b_j = α/(1-α)·(S_j^(1) - S_j^(2)),    (1-3)
    where Y_j is the actual value of the load information indicator in the j-th period, S_{j-1}^(1) and S_j^(1) are the single exponential smoothing values (predicted values) of the load information indicator in the (j-1)-th and j-th periods, S_{j-1}^(2) and S_j^(2) are the double exponential smoothing values of the (j-1)-th and j-th periods, Ŷ_{j+T} is the predicted value of the load information indicator in the (j+T)-th period, a_j and b_j are intermediate parameters, and α is the smoothing coefficient;
    the collection module sends the load information indicator values on each node of the distributed cluster collected in the previous n-1 periods in the database to the data pre-partitioning module, which combines them with the indicator values of each node in the current period into load data of size n; the actual value of the load information indicator measured in the first period is taken as the initial value Y_j and as the initial values of the single and double smoothing; the n load data are used to predict the load information indicator values of each node for the next d periods, the average value P of a node's indicator values over those d periods is calculated, and the load information indicator value of each node in the cluster is finally determined;
    Step 2, calculating the processing capacity of each node;
    Step 3, distributing different amounts of data according to the processing capacity of each node.
  5. The system according to claim 4, characterized in that in Step 1 the value of the smoothing coefficient is obtained by calculating the standard deviation S:
    S = sqrt( (1/n) · Σ_{j=1}^{n} (Ŷ_j - Y_j)² ),    (1-4)
    where n represents the number of periods taken; the deviation S is computed while adjusting the smoothing coefficient α, and the value of α for which S is smallest is taken.
  6. The system according to claim 5, characterized in that Step 2 comprises the following steps:
    Step 2-1, calculating with the AHP subjective weighting method: in multi-attribute decision making, the decision maker compares all evaluation indicators pairwise to obtain the judgment matrix U = (A_ij)_{n×n}, where A_ij is the value obtained by comparing evaluation indicator A_i with A_j; when its value is an odd number from 1 to 9, i.e. 1, 3, 5, 7 or 9, it means the former indicator is respectively equally important, moderately more important, strongly more important, very strongly more important or extremely more important than the latter; when its value is an even number between 1 and 9, the relative importance lies between the levels expressed by the two adjacent odd numbers, e.g. a value of 2 means the relative importance lies between the levels expressed by the adjacent odd numbers 1 and 3; and A_ji = 1/A_ij with A_ii = 1;
    comparing CPU utilization, memory utilization and bandwidth utilization pairwise yields the judgment matrix A:
    A = (A_ij)_{3×3} (the concrete matrix is given as an image in the original publication),    (1-5)
    where A_1, A_2 and A_3 represent, respectively, the weight of the influence of a node's CPU utilization, memory utilization and bandwidth utilization on the node's overall load. Each column of the judgment matrix A is normalized to obtain the column eigenvector, each row is then normalized to obtain the row eigenvector, the weight ratio of each indicator is finally obtained, and a consistency check is performed on the judgment matrix A, finally giving the subjective weights of a node's CPU, memory and bandwidth as WS_1, WS_2, WS_3 respectively, with WS_1 + WS_2 + WS_3 = 1;
    Step 2-2, calculating the eigenvector and indicator weights of the matrix:
    the columns of the matrix are summed, giving the column-sum vector SUM_j;
    each column of the matrix is normalized with the formula:
    B_ij = A_ij / ΣA_ij,    (1-6)
    where ΣA_ij is the sum SUM_j of the corresponding column and B_ij is the normalized value of A_ij; from B_ij a new matrix B is obtained, in which the values of every column sum to 1;
    each row of matrix B is summed, giving the eigenvector SUM_i;
    the indicator weights are calculated by normalizing the eigenvector with the formula:
    W_i = SUM_i / n,    (1-7)
    according to the above formulas, the three indicator weights W_1, W_2, W_3 are finally obtained;
    Step 2-3, performing the matrix consistency check:
    the largest eigenvalue of the matrix is calculated with the formula:
    λ_max = (1/n) · Σ_{i=1}^{n} (AW)_i / W_i,    (1-8)
    where λ_max is the largest eigenvalue, AW denotes the column vector obtained by multiplying matrix A by the weight vector W, n is the order of the matrix, and W is the weight vector;
    the consistency index of the judgment matrix is calculated with the formula:
    C.I. = (λ_max - n) / (n - 1),    (1-9)
    where C.I. denotes the consistency index and n is the order of the matrix;
    the random consistency ratio C.R. is calculated with the formula:
    C.R. = C.I. / R.I.,    (1-10)
    where R.I. denotes the average random consistency index, a constant that can be looked up in a table according to the order; for order 3, R.I. = 0.89. If C.R. < 0.1, the comparison matrix is consistent; if C.R. > 0.1, the comparison matrix is not consistent and needs to be adjusted;
    Step 2-4, calculating the objective weights with the entropy method:
    the load information decision matrix M is constructed:
    M = [ CUR_1  MUR_1  BUR_1
          CUR_2  MUR_2  BUR_2
          ...
          CUR_n  MUR_n  BUR_n ],
    where CUR_n, MUR_n and BUR_n represent, respectively, the predicted CPU utilization, memory utilization and bandwidth utilization of a node in the n-th period;
    each column of the decision matrix M is standardized to obtain the decision matrix R:
    R = (R_ij)_{n×3},  with  R_ij = M_ij / Σ_{i=1}^{n} M_ij,
    where R_i1 denotes the element in the i-th row and first column of the decision matrix R, and every column of R satisfies the normalization condition Σ_{i=1}^{n} R_ij = 1, i.e. the values of each column sum to 1, j = 1, 2, 3;
    the entropy of each load information indicator is calculated according to the formula:
    E_j = -K · Σ_{i=1}^{n} R_ij · ln(R_ij),    (1-11)
    where E_j denotes the entropy value of the load information indicator and K = 1/ln(n) is a constant, so that 0 ≤ E_j ≤ 1, i.e. E_j is at most 1; for j = 1, E_j is the entropy of CPU utilization, for j = 2 the entropy of memory utilization, and for j = 3 the entropy of bandwidth utilization;
    D_j is defined as the contribution of the j-th load information indicator: D_j = 1 - E_j;
    Step 2-5, calculating the objective weight value WO_j of each load information indicator:
    WO_j = D_j / Σ_{j=1}^{3} D_j,
    where WO_1, WO_2 and WO_3 represent, respectively, the objective weight of the influence of CPU, memory and bandwidth on the node load, with WO_1 + WO_2 + WO_3 = 1;
    Step 2-6, calculating the final weight w_i of the node's load information indicators:
    w_i = β × WS_i + (1-β) × WO_i,    (1-12)
    where β is the subjective-objective weight adjustment coefficient and w_i is the final node load weight, with i = 1, 2, 3 and w_1 + w_2 + w_3 = 1; w_1 is the final weight of CPU utilization, w_2 the final weight of memory utilization, and w_3 the final weight of bandwidth utilization;
    Step 2-7, calculating the processing capacity of each node:
    CA_i = w_1 × (1-CAU_i) + w_2 × (1-MAU_i) + w_3 × (1-BAU_i),    (1-13)
    where CAU_i, MAU_i and BAU_i denote, respectively, the predicted CPU utilization, memory utilization and bandwidth utilization of the i-th node in the current period, and CA_i denotes the processing capacity of the i-th node.
  7. The system according to claim 6, characterized in that Step 3 comprises:
    calculating the proportion of the amount of data to be allocated to each node:
    DP_i = CA_i / Σ_{i=1}^{m} CA_i,    (1-14)
    where DP_i denotes the proportion of the amount of data that should be allocated to the i-th node and m denotes the total number of nodes.
  8. The system according to claim 7, characterized in that the data migration module sets high and low load thresholds as the conditions for triggering data migration and constructs selection queues of source machines and target machines; when a load imbalance problem occurs, a source machine and a target machine are selected for data migration, the source machine being the node whose data is to be migrated and the target machine being the node that receives the migrated data, and the amount of data to be migrated is obtained.
  9. The system according to claim 8, characterized in that the data migration module sets high and low load thresholds as the conditions for triggering data migration and constructs selection queues of source machines and target machines; when a load imbalance problem occurs, a source machine and a target machine are selected for data migration, the source machine being the node whose data is to be migrated and the target machine being the node that receives the migrated data, and the amount of data to be migrated is obtained, specifically comprising the following steps:
    Step a1, selecting the source machine:
    the overall load value of each node is calculated:
    Load_i = w_1 × CUR_i + w_2 × MUR_i + w_3 × BUR_i,    (1-15)
    where Load_i denotes the overall load value of the i-th node; the overall load value of each node is compared with the set threshold H_th, and if a node's overall load value exceeds H_th, the node is added to the high-load node queue; the source-machine selection queue S_y = {s_1, s_2, ..., s_m} is formed in descending order of overall load value, s_m denoting the m-th node in S_y, i.e. the node with the smallest overall load value;
    for each node in the S_y queue, source machines are selected in descending order of overall load value;
    Step a2, selecting the target machine: the overall load value of each node is compared with the set threshold L_th, and if a node's load value is below L_th, the node is added to the low-load node queue; the target-machine selection queue D_m = {d_1, d_2, ..., d_z} is formed in ascending order of overall load value, d_z denoting the z-th node in D_m, i.e. the node with the largest overall load value;
    for each node in the D_m queue, target machines are selected in ascending order of overall load value;
    Step a3, performing the data migration:
    if the high-load and low-load queues contain the same number of nodes, i.e. m = z, the nodes in the two queues are matched in order and migrated in parallel, the number of partitions to migrate being given by formula 1-16 (provided as an image in the original publication), where N_q denotes the number of partitions to migrate, N_y the number of partitions on the source machine, and N_m the number of partitions on the target machine;
    if the number of nodes in the high-load queue is larger than the number of low-load nodes, i.e. S_y > D_m, the low-load threshold is adjusted appropriately so that the number of nodes in the low-load queue equals or slightly exceeds the number of nodes in the high-load queue, and the number of partitions to migrate is then set according to formula 1-16;
    if the number of nodes in the high-load queue is much smaller than the number of low-load nodes, i.e. S_y < D_m, the high-load threshold is lowered appropriately so that the number of nodes in the high-load queue equals or is slightly smaller than the number of nodes in the low-load queue, and the number of partitions to migrate is then set according to formula 1-16;
    once the number of partitions the source machine should migrate is obtained, the data migration can be carried out.
PCT/CN2020/090554 2019-10-15 2020-05-15 Node load-based dynamic data partitioning system WO2021073083A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910978247.3A CN110704542A (en) 2019-10-15 2019-10-15 Data dynamic partitioning system based on node load
CN201910978247.3 2019-10-15

Publications (1)

Publication Number Publication Date
WO2021073083A1 true WO2021073083A1 (en) 2021-04-22

Family

ID=69199661

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090554 WO2021073083A1 (en) 2019-10-15 2020-05-15 Node load-based dynamic data partitioning system

Country Status (2)

Country Link
CN (1) CN110704542A (en)
WO (1) WO2021073083A1 (en)


Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704542A (en) * 2019-10-15 2020-01-17 南京莱斯网信技术研究院有限公司 Data dynamic partitioning system based on node load
CN111158918B (en) * 2019-12-31 2022-11-11 深圳大学 Supporting point parallel enumeration load balancing method, device, equipment and medium
CN111400045B (en) * 2020-03-16 2023-09-05 杭州海康威视系统技术有限公司 Load balancing method and device
CN111581500A (en) * 2020-04-24 2020-08-25 贵州力创科技发展有限公司 Network public opinion-oriented data distributed directional storage method and device
CN111709623A (en) * 2020-06-04 2020-09-25 中国科学院计算机网络信息中心 High-performance computing environment evaluation method and device, electronic equipment and storage medium
CN111813512B (en) * 2020-06-23 2022-11-25 重庆邮电大学 High-energy-efficiency Spark task scheduling method based on dynamic partition
CN111966289B (en) * 2020-08-13 2024-02-09 上海哔哩哔哩科技有限公司 Partition optimization method and system based on Kafka cluster
JP2022074864A (en) * 2020-11-05 2022-05-18 富士通株式会社 Information processor, control method of information processor, and control program of information processor
CN112395318B (en) * 2020-11-24 2022-10-04 福州大学 Distributed storage middleware based on HBase + Redis
CN115309538A (en) * 2021-05-08 2022-11-08 戴尔产品有限公司 Multi-index based workload balancing between storage resources
CN113626426B (en) * 2021-07-06 2022-06-14 佛山市禅城区政务服务数据管理局 Method and system for collecting and transmitting ecological grid data
CN114117545A (en) * 2021-11-08 2022-03-01 重庆邮电大学 Tamper-proof electronic certification system and implementation method thereof
CN114900525B (en) * 2022-05-20 2022-12-27 中国地质大学(北京) Double-layer cooperative load balancing method for skew data stream and storage medium
CN115242797B (en) * 2022-06-17 2023-10-27 西北大学 Micro-service architecture-oriented client load balancing method and system
WO2024007171A1 (en) * 2022-07-05 2024-01-11 北京小米移动软件有限公司 Computing power load balancing method and apparatuses
CN115080215B (en) * 2022-08-22 2022-11-15 中诚华隆计算机技术有限公司 Method and system for performing task scheduling among computing nodes by state monitoring chip
CN116595102B (en) * 2023-07-17 2023-10-17 法诺信息产业有限公司 Big data management method and system for improving clustering algorithm
CN117724928A (en) * 2023-12-15 2024-03-19 谷技数据(武汉)股份公司 Intelligent operation and maintenance visual monitoring method and system based on big data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104978236B (en) * 2015-07-07 2018-11-06 四川大学 HDFS load source destination node choosing methods based on more measurement indexs
CN108628662A (en) * 2018-04-11 2018-10-09 武汉理工大学 Mix the resource elastic telescopic method based on load estimation under cloud environment
CN109783235A (en) * 2018-12-29 2019-05-21 西安交通大学 A kind of load equilibration scheduling method based on principle of maximum entropy

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298339A (en) * 2014-10-11 2015-01-21 东北大学 Server integration method oriented to minimum energy consumption
US20190171438A1 (en) * 2017-12-05 2019-06-06 Archemy, Inc. Active adaptation of networked compute devices using vetted reusable software components
US10310760B1 (en) * 2018-05-21 2019-06-04 Pure Storage, Inc. Layering communication fabric protocols
CN110704542A (en) * 2019-10-15 2020-01-17 南京莱斯网信技术研究院有限公司 Data dynamic partitioning system based on node load

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342618A (en) * 2021-06-30 2021-09-03 深圳前海微众银行股份有限公司 Distributed monitoring cluster management method, device and computer readable storage medium
CN113626282A (en) * 2021-07-16 2021-11-09 济南浪潮数据技术有限公司 Cloud computing physical node load monitoring method and device, terminal and storage medium
CN113626282B (en) * 2021-07-16 2023-12-22 济南浪潮数据技术有限公司 Cloud computing physical node load monitoring method, device, terminal and storage medium
CN113590319A (en) * 2021-07-28 2021-11-02 北京金山云网络技术有限公司 Computing resource load balancing method and device for message queue
CN113608870A (en) * 2021-07-28 2021-11-05 北京金山云网络技术有限公司 Load balancing method and device of message queue, electronic equipment and storage medium
CN113608876B (en) * 2021-08-12 2024-03-29 中国科学技术大学 Distributed file system metadata load balancing method based on load type perception
CN113608876A (en) * 2021-08-12 2021-11-05 中国科学技术大学 Distributed file system metadata load balancing method based on load type perception
CN113780852A (en) * 2021-09-16 2021-12-10 东北大学 Diagnosis method for quality defects in plate and strip rolling process
CN113780852B (en) * 2021-09-16 2024-03-05 东北大学 Diagnosis method for quality defects in plate and strip rolling process
CN113886081A (en) * 2021-09-29 2022-01-04 南京地铁建设有限责任公司 Station multi-face-brushing array face library segmentation method based on load balancing
CN113986557B (en) * 2021-11-15 2023-09-12 北京航空航天大学 Storage load balancing method and system for full-flow collection
CN113986557A (en) * 2021-11-15 2022-01-28 北京航空航天大学 Storage load balancing method and system for full-flow collection
CN114064281A (en) * 2021-11-22 2022-02-18 重庆邮电大学 Low-cost Spark actuator placement method based on BFD-VNS algorithm
CN114268547A (en) * 2021-12-09 2022-04-01 中国电子科技集团公司第五十四研究所 Multi-attribute decision-making air emergency communication network key node identification method
CN114201296A (en) * 2021-12-09 2022-03-18 厦门美亚亿安信息科技有限公司 Data balancing method and system based on streaming processing platform
CN114363340B (en) * 2022-01-12 2023-12-26 东南大学 Unmanned aerial vehicle cluster failure control method, system and storage medium
CN114363340A (en) * 2022-01-12 2022-04-15 东南大学 Unmanned aerial vehicle cluster failure control method and system and storage medium
CN114385088A (en) * 2022-01-19 2022-04-22 中山大学 Layout method for data correlation analysis in distributed storage system
CN114385088B (en) * 2022-01-19 2023-09-01 中山大学 Layout method after data relevance analysis in distributed storage system
CN114338696B (en) * 2022-03-14 2022-07-15 北京奥星贝斯科技有限公司 Method and device for distributed system
CN114666336A (en) * 2022-03-14 2022-06-24 西安热工研究院有限公司 API gateway-based dynamic weight service routing method
WO2023173917A1 (en) * 2022-03-14 2023-09-21 北京奥星贝斯科技有限公司 Method and apparatus for distributed system
CN114338696A (en) * 2022-03-14 2022-04-12 北京奥星贝斯科技有限公司 Method and device for distributed system
CN115061815A (en) * 2022-06-20 2022-09-16 北京计算机技术及应用研究所 Optimal scheduling decision method and system based on AHP
CN115061815B (en) * 2022-06-20 2024-03-26 北京计算机技术及应用研究所 AHP-based optimal scheduling decision method and system
CN115203177B (en) * 2022-09-16 2022-12-06 北京智阅网络科技有限公司 Distributed data storage system and storage method
CN115203177A (en) * 2022-09-16 2022-10-18 北京智阅网络科技有限公司 Distributed data storage system and storage method
CN116401111B (en) * 2023-05-26 2023-09-05 中国第一汽车股份有限公司 Function detection method and device of brain-computer interface, electronic equipment and storage medium
CN116401111A (en) * 2023-05-26 2023-07-07 中国第一汽车股份有限公司 Function detection method and device of brain-computer interface, electronic equipment and storage medium
CN116991580A (en) * 2023-07-27 2023-11-03 上海沄熹科技有限公司 Distributed database system load balancing method and device
CN117129556A (en) * 2023-08-29 2023-11-28 中国矿业大学 Indoor TVOC concentration real-time monitoring system based on wireless sensor network
CN117129556B (en) * 2023-08-29 2024-02-02 中国矿业大学 Indoor TVOC concentration real-time monitoring system based on wireless sensor network
CN117033004A (en) * 2023-10-10 2023-11-10 苏州元脑智能科技有限公司 Load balancing method and device, electronic equipment and storage medium
CN117033004B (en) * 2023-10-10 2024-02-09 苏州元脑智能科技有限公司 Load balancing method and device, electronic equipment and storage medium
CN117119058A (en) * 2023-10-23 2023-11-24 武汉吧哒科技股份有限公司 Storage node optimization method in Ceph distributed storage cluster and related equipment
CN117119058B (en) * 2023-10-23 2024-01-19 武汉吧哒科技股份有限公司 Storage node optimization method in Ceph distributed storage cluster and related equipment
CN117498399A (en) * 2023-12-29 2024-02-02 国网浙江省电力有限公司 Multi-energy collaborative configuration method and system considering elastic adjustable energy entity access
CN117498399B (en) * 2023-12-29 2024-03-08 国网浙江省电力有限公司 Multi-energy collaborative configuration method and system considering elastic adjustable energy entity access

Also Published As

Publication number Publication date
CN110704542A (en) 2020-01-17

Similar Documents

Publication Publication Date Title
WO2021073083A1 (en) Node load-based dynamic data partitioning system
CN104283946B (en) The resource-adaptive adjustment system and method for multi-dummy machine under a kind of single physical machine
CN104298339B (en) Server integration method oriented to minimum energy consumption
CN104978236B (en) HDFS load source destination node choosing methods based on more measurement indexs
CN104123600A (en) Electrical manager&#39;s index forecasting method for typical industry big data
WO2023103349A1 (en) Load adjustment method, management node, and storage medium
CN104298550A (en) Hadoop-oriented dynamic scheduling method
CN106066423A (en) A kind of analysis method of opposing electricity-stealing based on Loss allocation suspicion analysis
CN103294546A (en) Multi-dimensional resource performance interference aware on-line virtual machine migration method and system
CN105160149A (en) Method for constructing demand response scheduling evaluation system of simulated peak-shaving unit
CN109103874A (en) Consider the distribution network reliability evaluation method of part throttle characteristics and distributed generation resource access
CN110109971A (en) A kind of low-voltage platform area user power utilization Load Characteristic Analysis method
CN112288328A (en) Energy internet risk assessment method based on gray chromatography
CN114139940A (en) Generalized demand side resource network load interaction level assessment method based on combined empowerment-cloud model
Dezhabad et al. Cloud workload characterization and profiling for resource allocation
CN105393518B (en) Distributed cache control method and device
CN111507565A (en) Performance evaluation method and system of energy storage power station in frequency modulation application scene
CN106874607B (en) Power grid self-organization critical state quantitative evaluation method based on multi-level variable weight theory
CN116090893A (en) Control method and system for comprehensive energy participation auxiliary service of multiple parks
CN109657967A (en) A kind of confirmation method and system of Transmission Expansion Planning in Electric evaluating indexesto scheme weight
CN114844048A (en) Power grid regulation and control demand-oriented adjustable load regulation capacity evaluation method
CN108092828A (en) A kind of dynamic Service providing method, device and program
Lu et al. Evaluation of black-start schemes based on prospect theory and improved TOPSIS method
Wu et al. A dynamic resource-aware endorsement strategy for improving throughput in blockchain systems
CN112653121A (en) Method and device for evaluating power grid frequency modulation participation capability of new energy microgrid

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20875852

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20875852

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.01.2023)
