CN116599858A - Response time guarantee type cluster system and scale adjustment method thereof - Google Patents


Info

Publication number
CN116599858A
Authority
CN
China
Prior art keywords
cluster
load
service quality
cov
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310574528.9A
Other languages
Chinese (zh)
Inventor
胡程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Foreign Studies
Original Assignee
Guangdong University of Foreign Studies
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Foreign Studies
Priority to CN202310574528.9A
Publication of CN116599858A


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements
    • H04L43/55 Testing of service level quality, e.g. simulating service usage
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Environmental & Geological Engineering (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a response time guarantee type cluster system and a scale adjustment method thereof. Circular queues with overwrite-style statistical recording are used to calculate sampled values of five load features within a preset time, so that the condition of the cluster is reflected accurately, and the service quality of the cluster system is evaluated based on the average response time. When the service quality level of the cluster system is within the expected range, the feature values of the current load and the cluster scale actually provided are taken as samples. The degree of change association between each load feature and the service quality of the cluster system serves as an important basis for choosing the feature to divide on when generating a threshold decision tree, so that the generated tree better reflects the relationship between the load and the service quality of the system. The cluster scale matching the current load can then be obtained quickly from the feature values of the current load, and the cluster scale of the system can be adjusted to a suitable size in time.

Description

Response time guarantee type cluster system and scale adjustment method thereof
Technical Field
The invention relates to the field of computer system structures, in particular to a response time guarantee type cluster system and a scale adjustment method thereof.
Background
Cluster systems are widely used in cloud and fog computing. Because the load varies, the cluster scale is usually designed to be elastic so as to ensure the energy efficiency of the system. However, when the cluster scale is adjusted, the service quality of the system cannot be ignored for the sake of energy efficiency alone. The cluster scale should match the service quality requirement of the current load: if the scale is too small, service quality cannot be guaranteed; if it is too large, the wasted service resources reduce the energy efficiency of the system. The load typically changes constantly, so adjustment of the cluster scale should be fast and accurate. The traditional reactive cluster scale adjustment method does not evaluate how large a cluster scale matches the load demand. When the actual service quality deviates from the expected standard, it can only adjust the cluster scale by gradual trial, for example by adding or removing one cluster working node at a time. Gradual trial delays the adjustment, so the service quality and energy efficiency of the system are difficult to keep at a good level. To compensate for this deficiency, recent approaches take a large number of parameters into account and evaluate the best cluster scale with a fixed model. However, the actual load varies continuously, not only in intensity but also in its resource requirements, such as storage requirements and computation requirements. A fixed model may be relatively accurate for a short period, but as the load changes it is likely to no longer suit the subsequent load.
Therefore, it is important to dynamically and continuously adjust the cluster scale to a suitable size quickly according to the actual load. If the cluster scale cannot be adjusted to a suitable size in time, a scale that is too small after adjustment seriously affects the service quality of the system, and a scale that is too large makes it difficult to guarantee the energy efficiency level of the system. It is thus critical to accurately assess the system scale that the current load demands. In practice a completely accurate evaluation is not achievable, but the closer the evaluation result is to the actually required value, the smaller the fine-tuning cost, and the better the service quality and energy efficiency level of the system can be ensured. The prior art evaluates this demand by pursuing highly refined mathematical models. Because the factors involved in the operation of an actual system are numerous and difficult to grasp comprehensively, such models easily produce large deviations under general load (where the way the load uses resources is changeable). In addition, acquiring the data these models require consumes resources and further increases the system burden.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a response time guarantee type cluster system and a scale adjustment method thereof, which efficiently monitor the load condition, dynamically and continuously determine the cluster scale, and adjust the cluster scale of the system to a suitable size.
A response time guarantee type cluster system and a scale adjustment method thereof, wherein the response time guarantee type cluster scale adjustment method comprises the following steps:
S1, continuously monitoring the load condition through a load detection module, recording statistics in circular queues, and calculating, within a preset time, sampled values of five load features: the task arrival rate R_ar, the coefficient of variation of task arrival intervals CoV_ar, the average response time of completed load tasks T_rt, the coefficient of variation of task service time CoV_st, and the average task service rate of the cluster working nodes R_sr;
S2, comparing, through a service quality monitoring module, the average response time T_rt of the load tasks completed within the preset time with a preset response time standard E_rt, and evaluating the service quality condition of the cluster system;
S3, continuously analyzing, through a threshold setting and adjusting module, the degree of change association between the sampled values R_ar, CoV_ar, T_rt, CoV_st, R_sr of the load features and the service quality of the cluster system, and then setting the initial thresholds of the load features in order of association degree from largest to smallest;
S4, adjusting the scale of the cluster system through a scale adjustment module, specifically comprising:
S41, the scale adjustment module obtains the current service quality condition of the cluster system from the service quality monitoring module;
S42, if the service quality level of the cluster system is within the expected range, the threshold setting and adjusting module acquires the sampled value of each feature of the current load and the current cluster scale value and records them as a sample; after a set number of samples has been accumulated, the load feature with the largest degree of change association with the service quality of the cluster system in the samples is taken as the division basis, the optimal division value is determined according to the information gain maximization principle, the initial threshold is adjusted, and a threshold decision tree is generated;
S43, if the service quality level is not within the expected range, the threshold decision tree generated by the threshold setting and adjusting module is used: each feature sampled value of the current load is compared one by one with the corresponding adjusted threshold to determine the optimal cluster scale suitable for the current load condition, cluster service nodes are switched between the working state and the idle state, and the number of cluster working nodes is adjusted accordingly;
S5, receiving the load through a task distribution module; according to the task service rate of each cluster working node obtained by the load detection module, tasks are distributed one by one in a first-come first-served manner, each time to the cluster working node with the highest task service rate; the cluster working node processes the received task and, after processing is finished, returns a response to the user terminal as actually needed.
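The division step in S42 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the text says that the optimal division value follows the information gain maximization principle but does not give the procedure, so the candidate cut points (midpoints between adjacent sorted values), the use of Shannon entropy over cluster scale labels, and the names `entropy` and `best_threshold` are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of cluster scale labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Find the cut on one load feature that maximizes information gain.

    values: sampled values of the chosen load feature (e.g. R_ar)
    labels: cluster scale recorded with each sample
    Candidate cuts are midpoints between adjacent distinct values
    (an assumption; the source does not specify the candidates).
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between equal values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    return best_cut, best_gain
```

For example, with arrival-rate samples 10, 20, 30, 40 recorded at cluster scales 2, 2, 4, 4, the sketch places the cut midway between 20 and 30, which separates the two scales completely.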
Preferably, in step S1, the circular queues use overwrite-style storage. When monitoring R_ar and CoV_ar, the load detection module uses a circular queue A, records the arrival times of arriving tasks in chronological order, and at the preset statistical time calculates R_ar and CoV_ar from the records in circular queue A; when monitoring CoV_st and R_sr, it uses a circular queue B, records the service times of completed tasks in chronological order, and at the preset statistical time calculates CoV_st and R_sr from the records in circular queue B.
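A minimal sketch of circular queue A follows, assuming a fixed capacity and timestamps in seconds. The text does not give the formulas, so two choices here are assumptions: R_ar is taken as the number of recorded intervals divided by the time they span, and CoV_ar as the population standard deviation of the intervals divided by their mean. The class name `ArrivalMonitor` is hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class ArrivalMonitor:
    """Circular queue A: holds only the most recent arrival timestamps."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest record when full,
        # which gives the overwrite-style storage described above.
        self.arrivals = deque(maxlen=capacity)

    def record(self, timestamp):
        self.arrivals.append(timestamp)

    def sample(self):
        """Return (R_ar, CoV_ar) from the records currently held."""
        if len(self.arrivals) < 2:
            return 0.0, 0.0
        times = list(self.arrivals)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        span = times[-1] - times[0]
        r_ar = len(gaps) / span if span > 0 else 0.0
        avg_gap = mean(gaps)
        cov_ar = pstdev(gaps) / avg_gap if avg_gap > 0 else 0.0
        return r_ar, cov_ar
```

Circular queue B could be handled the same way, with service times stored in place of arrival timestamps.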
Preferably, the specific steps of step S2 are as follows: S21, setting a circular queue C for each cluster working node through the service quality monitoring module; S22, each time a cluster working node completes a task, recording the response time of that task in the circular queue C of the corresponding cluster working node; S23, after the preset duration, traversing all circular queue records, calculating the average response time T_rt of the completed load tasks, comparing T_rt with the preset response time standard, and evaluating the service quality condition of the cluster system.
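Steps S21 to S23 can be sketched as follows. This is an illustrative sketch only: the class name `ResponseTimeMonitor`, the node identifiers and the shared queue capacity are assumptions, and T_rt is taken as the plain mean over every record held in all per-node queues.

```python
from collections import deque


class ResponseTimeMonitor:
    """One circular queue C per cluster working node (S21)."""

    def __init__(self, node_ids, capacity):
        self.queues = {node: deque(maxlen=capacity) for node in node_ids}

    def record(self, node_id, response_time):
        # S22: store the response time of a finished task in that node's queue.
        self.queues[node_id].append(response_time)

    def average_response_time(self):
        # S23: traverse all queues and average every recorded response time.
        total = sum(sum(q) for q in self.queues.values())
        count = sum(len(q) for q in self.queues.values())
        return total / count if count else None
```

Returning `None` when no task has completed yet is a design choice of this sketch, so that an empty window is not mistaken for a very fast one.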
Preferably, in step S3 the degree of change association between R_ar, CoV_ar, T_rt, CoV_st, R_sr and the service quality of the cluster system is judged as follows: S31, the threshold setting and adjusting module acquires the sampled values of each load feature, namely R_ar, CoV_ar, T_rt, CoV_st, R_sr, in the consecutive time periods preceding the latest sample; S32, using the sampled values of the load features in each time period as statistical data, it calculates the service quality relative rate P_rq and then the correlation coefficient of P_rq with each of R_ar, CoV_ar, T_rt, CoV_st and R_sr, where [formula for P_rq not preserved in this text], in which E_rt is the preset response time standard; S33, the magnitudes of the obtained correlation coefficient values correspondingly determine the degree of change association between each load feature and the service quality of the cluster system.
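The ranking in S32 and S33 can be sketched as below. Several points are assumptions: the text only says "correlation coefficient", so the Pearson coefficient is used here as one common choice; the ranking uses the absolute value of the coefficient; and the series of P_rq values is taken as an input, since its defining formula is not available in this text. The function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no association
    return cov / sqrt(var_x * var_y)


def rank_features(p_rq, features):
    """Order load features by |correlation with P_rq|, largest first.

    p_rq: P_rq value per time period (supplied by the caller)
    features: dict mapping a feature name to its per-period samples
    """
    scores = {name: abs(pearson(vals, p_rq)) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

The resulting order is the order in which the initial thresholds would be set, with the first feature serving as the primary division basis.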
Preferably, the service quality level not being within the expected range in step S43 covers two cases: in one, the average response time T_rt is greater than the preset response time standard E_rt, and the service quality of the cluster system cannot be guaranteed; in the other, the average response time T_rt is far less than the preset response time standard E_rt, and the service quality of the cluster system is better than necessary.
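The two out-of-range cases can be expressed as a small classifier. The text does not quantify "far less than", so the `low_ratio` parameter and its default of 0.5 are purely hypothetical, as are the function name and the three labels.

```python
def classify_qos(t_rt, e_rt, low_ratio=0.5):
    """Classify service quality from T_rt and the standard E_rt.

    low_ratio is an assumed cut-off for "far less than E_rt";
    the source gives no numeric value.
    """
    if t_rt > e_rt:
        return "insufficient"  # service quality cannot be guaranteed
    if t_rt < low_ratio * e_rt:
        return "excessive"     # cluster is larger than the load needs
    return "expected"          # within the expected range
```

Under these assumptions, "insufficient" and "excessive" would both trigger the decision-tree lookup of S43, while "expected" would trigger sample collection as in S42.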
The invention also provides a response time guarantee type cluster system comprising a cluster manager and cluster service nodes, wherein the cluster manager comprises a load detection module, a service quality monitoring module, a threshold setting and adjusting module, a scale adjustment module and a task distribution module; the cluster service nodes comprise cluster working nodes in a working state and cluster idle nodes in an idle state; the load detection module is used for continuously monitoring the load condition, recording statistics in circular queues, and calculating, within a preset time, sampled values of five load features: the task arrival rate R_ar, the coefficient of variation of task arrival intervals CoV_ar, the average response time of completed load tasks T_rt, the coefficient of variation of task service time CoV_st, and the average task service rate of the cluster working nodes R_sr; the service quality monitoring module is used for comparing the average response time T_rt of the load tasks completed within the preset time with a preset response time standard E_rt and evaluating the service quality condition of the cluster system; the threshold setting and adjusting module is used for continuously analyzing the degree of change association between the sampled values R_ar, CoV_ar, T_rt, CoV_st, R_sr of the load features and the service quality of the cluster system, and then setting the initial thresholds of the load features in order of association degree from largest to smallest; the scale adjustment module is used for obtaining the current service quality condition of the cluster system from the service quality monitoring module; if the service quality level of the cluster system is within the expected range, the threshold setting and adjusting module acquires the sampled value of each feature of the current load and the current cluster scale value and records them as a sample, and after a set number of samples has been accumulated, the load feature with the largest degree of change association with the service quality of the cluster system in the samples is taken as the division basis, the optimal division value is determined according to the information gain maximization principle, the initial threshold is adjusted, and a threshold decision tree is generated; if the service quality level is not within the expected range, the threshold decision tree is used to compare each feature sampled value of the current load one by one with the corresponding adjusted threshold, the optimal cluster scale suitable for the current load condition is determined, cluster service nodes are switched between the working state and the idle state, and the number of cluster working nodes is adjusted accordingly; the task distribution module is used for receiving the load and, according to the task service rate of each cluster working node obtained by the load detection module, distributing tasks one by one in a first-come first-served manner, each time to the cluster working node with the highest task service rate; the cluster working node processes the received task and, after processing is finished, returns a response to the user terminal as actually needed.
Further, the circular queues use overwrite-style storage. The load detection module uses a circular queue A, records the arrival times of arriving tasks in chronological order, and at the preset statistical time calculates R_ar and CoV_ar from the records in circular queue A; it also uses a circular queue B, records the service times of completed tasks in chronological order, and at the preset statistical time calculates CoV_st and R_sr from the records in circular queue B.
Further, the service quality monitoring module evaluates the service quality condition of the cluster system as follows: a circular queue C is set for each cluster working node; each time a cluster working node completes a task, the response time of that task is recorded in the circular queue C of the corresponding cluster working node; after the preset duration, all circular queues C are traversed, the average response time T_rt of the completed load tasks is calculated, T_rt is compared with the preset response time standard, and the service quality condition of the cluster system is evaluated.
Further, the threshold setting and adjusting module judges the degree of change association between R_ar, CoV_ar, T_rt, CoV_st, R_sr and the service quality of the cluster system as follows: it acquires the sampled values of each load feature, namely R_ar, CoV_ar, T_rt, CoV_st, R_sr, in the consecutive time periods preceding the latest sample; using the sampled values of the load features in each time period as statistical data, it calculates the service quality relative rate P_rq and then the correlation coefficient of P_rq with each of R_ar, CoV_ar, T_rt, CoV_st and R_sr, where [formula for P_rq not preserved in this text], in which E_rt is the preset response time standard; the magnitudes of the obtained correlation coefficient values correspondingly determine the degree of change association between each load feature and the service quality of the cluster system.
The beneficial effects of the invention are as follows. The invention provides a response time guarantee type cluster system and a scale adjustment method thereof. Circular queues with overwrite-style statistical recording are used to calculate sampled values of five load features within a preset time, so that the condition of the cluster is reflected accurately, and the service quality of the cluster system is evaluated based on the average response time. When the service quality level of the cluster system is within the expected range, the feature values of the current load and the cluster scale actually provided are taken as samples. The degree of change association between each load feature and the service quality of the cluster system serves as an important basis for choosing the feature to divide on when generating a threshold decision tree, so that the generated tree better reflects the relationship between the load and the service quality of the system. The cluster scale matching the current load can then be obtained quickly from the feature values of the current load, and the cluster scale of the system can be adjusted to a suitable size in time.
Drawings
FIG. 1 is a workflow diagram of a response time guaranteed cluster scaling method provided by the present invention;
FIG. 2 is a schematic diagram of a response time guaranteed cluster system according to the present invention;
FIG. 3 is a timing flow chart of a method for scale adjustment of a response time guaranteed cluster provided by the present invention;
FIG. 4 is a schematic diagram of an example of a threshold decision tree according to an embodiment of the present invention.
Drawing reference numerals
1. cluster manager; 11. load detection module; 12. service quality monitoring module; 13. threshold setting and adjusting module; 14. scale adjustment module; 15. task distribution module; 2. cluster service node; 21. cluster working node; 22. cluster idle node.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1 and 2, the present invention provides a response time guarantee type cluster system and a scale adjustment method thereof.
Specifically, a response time guarantee type cluster system comprises a cluster manager 1 and cluster service nodes 2, wherein the cluster manager 1 comprises a load detection module 11, a service quality monitoring module 12, a threshold setting and adjusting module 13, a scale adjustment module 14 and a task distribution module 15; the cluster service nodes 2 comprise cluster working nodes 21 in a working state and cluster idle nodes 22 in an idle state. A cluster working node 21 in the working state provides normal service. A cluster idle node 22 is a cluster service node 2 in a standby, dormant or off state that provides no service; the more such nodes there are, the lower the system power consumption. Users of the cluster system may be intranet or extranet users, who send work tasks, that is, user load, through the corresponding network. After receiving the user load, the cluster manager 1 selects the cluster working node 21 with the lightest load to serve it. The manager 1 is also responsible for overall management of system resources and detection of the system state, so the manager 1 itself is kept in a normal working state over the long term. When the user load drops, some cluster working nodes 21 are marked by the manager 1 and enter the idle state after completing their existing load; when the user load rises, the manager 1 calls up some cluster idle nodes 22 to increase the number of cluster working nodes 21 providing service.
The load detection module 11 is used for continuously monitoring the load condition, recording statistics in circular queues, and calculating, within a preset time, sampled values of five load features: the task arrival rate R_ar, the coefficient of variation of task arrival intervals CoV_ar, the average response time of completed load tasks T_rt, the coefficient of variation of task service time CoV_st, and the average task service rate of the cluster working nodes R_sr.
The load detection module 11 uses a circular queue A, records the arrival times of arriving tasks in chronological order, and at the preset statistical time calculates R_ar and CoV_ar from the records in circular queue A; it also uses a circular queue B, records the service times of completed tasks in chronological order, and at the preset statistical time calculates CoV_st and R_sr from the records in circular queue B.
The service quality monitoring module 12 is used for comparing the average response time T_rt of the load tasks completed within the preset time with a preset response time standard E_rt and evaluating the service quality condition of the cluster system.
The service quality monitoring module 12 evaluates the service quality condition of the cluster system as follows: a circular queue C is set for each cluster working node 21; each time a cluster working node 21 completes a task, the response time of that task is recorded in the circular queue C of the corresponding cluster working node 21; after the preset duration, all circular queues C are traversed, the average response time T_rt of the completed load tasks is calculated, T_rt is compared with the preset response time standard, and the service quality condition of the cluster system is evaluated.
The threshold setting and adjusting module 13 is used for continuously analyzing the degree of change association between the sampled values R_ar, CoV_ar, T_rt, CoV_st, R_sr of the load features and the service quality of the cluster system, and then setting the initial thresholds of the load features in order of association degree from largest to smallest.
The threshold setting and adjusting module 13 judges the degree of change association between R_ar, CoV_ar, T_rt, CoV_st, R_sr and the service quality of the cluster system as follows: it acquires the sampled values of each load feature, namely R_ar, CoV_ar, T_rt, CoV_st, R_sr, in the consecutive time periods preceding the latest sample; using the sampled values of the load features in each time period as statistical data, it calculates the service quality relative rate P_rq and then the correlation coefficient of P_rq with each of R_ar, CoV_ar, T_rt, CoV_st and R_sr, where [formula for P_rq not preserved in this text], in which E_rt is the preset response time standard; the magnitudes of the obtained correlation coefficient values correspondingly determine the degree of change association between each load feature and the service quality of the cluster system.
The scale adjustment module 14 is used for obtaining the current service quality condition of the cluster system from the service quality monitoring module 12. If the service quality level of the cluster system is within the expected range, the threshold setting and adjusting module 13 acquires the sampled value of each feature of the current load and the current cluster scale value and records them as a sample; after a set number of samples has been accumulated, the load feature with the largest degree of change association with the service quality of the cluster system in the samples is taken as the division basis, the optimal division value is determined according to the information gain maximization principle, the initial threshold is adjusted, and a threshold decision tree is generated. If the service quality level is not within the expected range, the threshold decision tree is used: each feature sampled value of the current load is compared one by one with the corresponding adjusted threshold, the optimal cluster scale suitable for the current load condition is determined, cluster service nodes 2 are switched between the working state and the idle state, and the number of cluster working nodes 21 is adjusted accordingly.
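The lookup and state switching performed by the scale adjustment module can be sketched as follows. The nested-dictionary tree layout, the "low"/"high" branch names, and the functions `lookup_scale` and `resize` are hypothetical; the sketch also switches nodes immediately, whereas the description has a marked working node finish its existing load before it becomes idle.

```python
def lookup_scale(tree, sample):
    """Walk a threshold decision tree down to a cluster scale.

    tree: nested dicts {"feature", "threshold", "low", "high"};
          a leaf is an integer cluster scale (assumed layout).
    sample: dict of the current sampled load feature values.
    """
    node = tree
    while isinstance(node, dict):
        branch = "low" if sample[node["feature"]] <= node["threshold"] else "high"
        node = node[branch]
    return node


def resize(working, idle, target):
    """Switch nodes between working and idle until `target` nodes work."""
    working, idle = list(working), list(idle)
    while len(working) < target and idle:
        working.append(idle.pop())   # wake an idle node
    while len(working) > target:
        idle.append(working.pop())   # retire a working node
    return working, idle
```

With a tree that divides first on R_ar and then on CoV_st, a single pass from root to leaf yields the target scale, which is what lets the adjustment skip the step-by-step trial of the reactive method.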
The task distribution module 15 is used for receiving the load and, according to the task service rate of each cluster working node 21 obtained by the load detection module 11, distributing tasks one by one in a first-come first-served manner, each time to the cluster working node 21 with the highest task service rate; the cluster working node 21 processes the received task and, after processing is finished, returns a response to the user terminal as actually needed.
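A minimal sketch of this first-come first-served distribution is given below. It assumes that the current per-node task service rates can be fetched through a callable, here named `current_rates`; that callable and the function `dispatch` are hypothetical stand-ins for the interface between the task distribution module and the load detection module.

```python
from collections import deque


def dispatch(tasks, current_rates):
    """Assign tasks in arrival order to the node with the highest rate.

    tasks: iterable of tasks in arrival order
    current_rates: callable returning {node_id: task service rate},
                   queried anew for each task (assumed interface)
    """
    pending = deque(tasks)  # FIFO order gives first-come first-served
    assignments = []
    while pending:
        task = pending.popleft()
        rates = current_rates()
        node = max(rates, key=rates.get)
        assignments.append((task, node))
    return assignments
```

Querying the rates once per task matters in this sketch: a node that has just received work may no longer have the highest service rate when the next task is assigned.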
Referring to fig. 1 and 3, the response time guaranteed cluster scale adjustment method includes steps S1 to S6.
Specifically, S1, the load detection module 11 continuously monitors the load condition, keeps statistical records in circular queues, and calculates, within a preset time, sampled values of five load characteristics: the task arrival rate $R_{ar}$, the coefficient of variation $CoV_{ar}$ of the task arrival intervals, the average response time $T_{rt}$ of completed load tasks, the coefficient of variation $CoV_{st}$ of the task service time, and the average task service rate $R_{sr}$ of the cluster working nodes. The five load characteristics $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$ are selected because together they reflect the intensity and variability of the load as well as the state of the cluster system, and therefore reflect the condition of the cluster system effectively and accurately.
In step S1, the circular queues store records in an overwriting manner, so that the oldest record is replaced once a queue is full. When monitoring $R_{ar}$ and $CoV_{ar}$, the load detection module 11 uses a circular queue A to record the arrival times of arriving tasks in chronological order and, at the preset statistical time, calculates $R_{ar}$ and $CoV_{ar}$ from the records in circular queue A. When monitoring $CoV_{st}$ and $R_{sr}$, it uses a circular queue B to record the service times of completed tasks in chronological order and, at the preset statistical time, calculates $CoV_{st}$ and $R_{sr}$ from the records in circular queue B.
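As a rough illustration of how circular queue A could yield $R_{ar}$ and $CoV_{ar}$, the sketch below keeps the most recent arrival timestamps in a bounded buffer. The class name `ArrivalMonitor`, the capacity, and the choice of estimating the rate as the reciprocal of the mean arrival interval are assumptions made for this example; the patent text does not fix these details.

```python
from collections import deque
from statistics import mean, pstdev


class ArrivalMonitor:
    """Hypothetical stand-in for circular queue A (arrival timestamps)."""

    def __init__(self, capacity=1024):
        # deque(maxlen=...) discards the oldest record when full,
        # which models the overwriting behaviour of the circular queue.
        self.arrivals = deque(maxlen=capacity)

    def record(self, timestamp):
        """Store the arrival time of one task, in chronological order."""
        self.arrivals.append(timestamp)

    def sample(self):
        """Return (R_ar, CoV_ar), or None if there are too few records."""
        if len(self.arrivals) < 3:
            return None
        times = list(self.arrivals)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        mean_gap = mean(gaps)
        if mean_gap == 0:
            return None
        r_ar = 1.0 / mean_gap             # assumed estimator: tasks per time unit
        cov_ar = pstdev(gaps) / mean_gap  # std. deviation / mean of the intervals
        return r_ar, cov_ar
```

The same pattern would apply to circular queue B, with service times in place of arrival intervals.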
S2, the service quality monitoring module 12 compares the average response time $T_{rt}$ of the load tasks completed within the preset time with the preset response time standard $E_{rt}$ and evaluates the service quality condition of the cluster system.
The specific steps of step S2 are as follows:
S21, the service quality monitoring module 12 sets up a circular queue C for each cluster working node 21;
S22, whenever a cluster working node 21 completes a task, the response time of that task is recorded in the circular queue C of the corresponding cluster working node 21;
S23, after the preset duration, all circular queue C records are traversed, the average response time $T_{rt}$ of the completed load tasks is calculated, and $T_{rt}$ is compared with the preset response time standard $E_{rt}$ to evaluate the service quality condition of the cluster system.
When the average response time $T_{rt}$ is far smaller than, or is larger than, the preset response time standard $E_{rt}$, the service quality level of the cluster system is not within the desired range; when the average response time $T_{rt}$ is slightly less than or equal to the preset response time standard $E_{rt}$, the service quality level of the cluster system is within the desired range.
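The evaluation in steps S21 to S23 can be sketched roughly as follows. The text does not quantify "slightly less" or "far smaller", so the `slack` parameter is a hypothetical tolerance introduced only for this example, as are the function names and the returned labels.

```python
from collections import deque


def average_response_time(node_queues):
    """Traverse every node's circular queue C and average the response times."""
    records = [t for queue in node_queues.values() for t in queue]
    return sum(records) / len(records) if records else None


def evaluate_qos(t_rt, e_rt, slack=0.2):
    """Classify the service quality level from T_rt and E_rt.

    slack is an assumed tolerance: T_rt in [(1 - slack) * E_rt, E_rt]
    is treated as "slightly less than or equal to" the standard.
    """
    if t_rt > e_rt:
        return "insufficient"       # the response time standard is not met
    if t_rt < (1.0 - slack) * e_rt:
        return "over-provisioned"   # far below the standard
    return "in_range"


# Example: two working nodes, each with its own bounded queue C.
queues = {"node-1": deque([100, 200], maxlen=4), "node-2": deque([300], maxlen=4)}
status = evaluate_qos(average_response_time(queues), e_rt=200)
```

Only the `"in_range"` outcome corresponds to the desired range; the other two outcomes would trigger a scale adjustment.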
S3, the threshold setting adjustment module 13 continuously analyzes the degree of change association between the sampled values $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ of the load characteristics and the service quality of the cluster system, and then sets the initial threshold of each load characteristic in order of decreasing association degree.
In step S3, the degree of change association between $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ and the service quality of the cluster system is judged as follows:
S31, the threshold setting adjustment module 13 acquires the sampled values of each load characteristic, namely $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, over the consecutive time periods preceding the latest sample;
S32, using the sampled load characteristics of each time period as statistical data, it calculates the service quality relative rate $P_{rq}$ and then the correlation coefficient between $P_{rq}$ and each of $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, where $P_{rq}$ is defined by a formula in which $E_{rt}$ is the preset response time standard;
S33, the magnitudes of the resulting correlation coefficient values correspondingly determine the degree of change association between each load characteristic and the service quality of the cluster system.
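A minimal sketch of steps S31 to S33 is given below. It assumes the Pearson correlation coefficient and ranks characteristics by the absolute value of the coefficient; the text only says "correlation coefficient", so both choices are assumptions. The $P_{rq}$ series is passed in as data because its defining formula is not reproduced here, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no association information
    return cov / sqrt(vx * vy)


def rank_features(p_rq, features):
    """Order load characteristics by |correlation with P_rq|, strongest first.

    p_rq     -- service quality relative rate per time period (given as input)
    features -- dict mapping a characteristic name to its per-period samples
    """
    scores = {name: abs(pearson(values, p_rq)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

The resulting order would then drive the order in which initial thresholds are set in step S3.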
S4, the scale of the cluster system is adjusted through the scale adjustment module 14, with the following specific steps:
S41, the scale adjustment module 14 acquires the current service quality condition of the cluster system from the service quality monitoring module 12;
S42, if the service quality level of the cluster system is within the desired range, the threshold setting adjustment module 13 acquires the sampled value of each characteristic of the current load together with the current cluster scale value and stores them as one sample record; after a set number of samples has accumulated, it takes the load characteristic with the greatest degree of change association with the service quality of the cluster system as the division basis, determines the optimal division value according to the information gain maximization principle, and adjusts the initial thresholds step by step as the load characteristics are successively divided, thereby generating a threshold decision tree.
The information gain is

$$\mathrm{Gain}(S, x) = \mathrm{Ent}(S) - \sum_{i=1}^{v} \frac{|S_i|}{|S|}\,\mathrm{Ent}(S_i), \qquad \mathrm{Ent}(S) = -\sum_{k=1}^{y} p_k \log_2 p_k,$$

where $x$ represents the load characteristic used as the division basis, $S$ represents the sample set to be divided ($|S|$ is the number of samples it contains), $S_i$ is the $i$-th sample subset after division ($|S_i|$ is the number of samples it contains), $v$ is the number of subsets $S_i$ produced by the division, $y$ is the number of types contained in $S$ (for example, if the cluster scales appearing in the samples are 10%, 50% and 70%, then $y$ is 3), and $p_k$ is the proportion of samples belonging to the $k$-th type (that is, having the same cluster scale value) in the total number of samples. After each division, all of the above steps are performed recursively on each resulting sample subset until all samples have been divided.
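To make the information gain criterion concrete, the following sketch evaluates a two-way split of one load characteristic at a candidate threshold and picks the threshold with the largest gain. Restricting the division to two subsets ($v = 2$) and taking midpoints between adjacent observed values as candidates are assumptions of this example, not requirements stated in the text; all function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Ent(S): entropy of the cluster scale values found in a sample set."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(samples, labels, feature, threshold):
    """Gain of splitting on samples[i][feature] <= threshold versus > threshold."""
    left = [lab for s, lab in zip(samples, labels) if s[feature] <= threshold]
    right = [lab for s, lab in zip(samples, labels) if s[feature] > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part) for part in (left, right))
    return entropy(labels) - remainder


def best_split(samples, labels, feature):
    """Return the candidate threshold with maximal gain, or None if none exists."""
    values = sorted({s[feature] for s in samples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(
        candidates,
        key=lambda t: information_gain(samples, labels, feature, t),
        default=None,
    )
```

Here `labels` holds the cluster scale recorded with each sample, so a split with high gain separates load conditions that needed different cluster scales.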
S43, if the service quality level is not within the desired range, the threshold decision tree generated by the threshold setting adjustment module 13 is used: each sampled characteristic of the current load is compared one by one with the adjusted thresholds to determine the optimal cluster scale for the current load condition, and the cluster service nodes 2 are switched between states so that the number of cluster working nodes 21 matches that scale.
The scale adjustment module 14 is responsible for adjusting the cluster working nodes 21 to the number corresponding to the determined cluster scale value, thereby providing the cluster system with a number of cluster working nodes 21 that matches the load. In the initial stage, before the decision tree has been generated, when the service quality level of the cluster system is not within the desired range, the sampled value of each characteristic of the current load is compared one by one with the initial thresholds set by the threshold setting adjustment module 13, and the number of cluster working nodes 21 is increased or reduced each time, so that the cluster scale is adjusted gradually. After the threshold decision tree has been generated, when the service quality level of the cluster system is not within the desired range, the scale adjustment module 14 starts from the root of the threshold decision tree with the sampled values of the current load characteristics and descends according to each division value until a leaf node is reached; the cluster scale is then adjusted to the scale recorded in that leaf node, and the cluster service nodes 2 are switched between states so that the number of cluster working nodes 21 matches it.
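The root-to-leaf descent described above can be sketched as follows. The tree layout, the thresholds, the percentage representation of the cluster scale, and the rule of keeping at least one working node are all hypothetical choices for illustration; they are not taken from the patent or its figures.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    cluster_scale: int  # assumed unit: percentage of service nodes that work


@dataclass
class Split:
    feature: str                # load characteristic tested at this node
    threshold: float            # division value chosen during tree construction
    low: Union["Split", Leaf]   # taken when sample[feature] <= threshold
    high: Union["Split", Leaf]  # taken when sample[feature] > threshold


def decide_scale(node, sample):
    """Descend from the root until a leaf is reached; return its cluster scale."""
    while isinstance(node, Split):
        node = node.low if sample[node.feature] <= node.threshold else node.high
    return node.cluster_scale


def nodes_to_switch(total_nodes, working_nodes, scale_percent):
    """Positive: idle nodes to wake up; negative: working nodes to idle."""
    target = max(1, round(total_nodes * scale_percent / 100))  # assumed floor of 1
    return target - working_nodes


# Hypothetical two-level tree with made-up thresholds.
tree = Split("R_ar", 120, Leaf(10), Split("T_rt", 150, Leaf(50), Leaf(70)))
scale = decide_scale(tree, {"R_ar": 135, "T_rt": 183})
```

With 20 service nodes of which 10 are working, `nodes_to_switch(20, 10, scale)` gives the number of idle nodes to switch into the working state.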
In step S43, the service quality level being outside the desired range covers two cases: in one, the average response time $T_{rt}$ is greater than the preset response time standard $E_{rt}$, and the service quality of the cluster system cannot be guaranteed; in the other, the average response time $T_{rt}$ is far less than the preset response time standard $E_{rt}$, and the service quality of the cluster system is better than necessary.
S5, the task distribution module 15 receives the load and, in a first-come first-served manner, distributes the tasks one by one, each time selecting the cluster working node 21 with the highest task service rate according to the task service rates of the cluster working nodes 21 obtained by the load detection module 11; the cluster working node 21 processes the received task and, where required, returns a response to the user terminal after processing is completed.
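One way to picture the distribution rule in S5 is the sketch below. The `current_rates` callable is a hypothetical stand-in for querying the load detection module before each assignment; how the rates are actually refreshed is not specified in the text.

```python
from collections import deque


def distribute(tasks, current_rates):
    """Assign tasks in arrival order to the node with the highest service rate.

    tasks         -- iterable of tasks in arrival order
    current_rates -- hypothetical callable returning {node_id: task service rate},
                     standing in for the load detection module
    """
    pending = deque(tasks)
    assignments = []
    while pending:
        task = pending.popleft()           # first come, first served
        rates = current_rates()            # latest per-node service rates
        node = max(rates, key=rates.get)   # node with the highest rate
        assignments.append((task, node))
    return assignments
```

Re-reading the rates before every assignment matters: with a fixed snapshot, every task would go to the same node.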
Referring to fig. 4, which is a schematic diagram of an example structure of a threshold decision tree constructed by step S42 above according to the present embodiment: when the sampled values of the five load characteristics $R_{ar}$, $T_{rt}$, $R_{sr}$, $CoV_{st}$, $CoV_{ar}$ are 135, 183 ms, 110, 30 and 22 respectively, the threshold decision tree shown in fig. 4 finally determines that the cluster size needs to be adjusted to 70%.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any other changes, modifications, substitutions, combinations, and simplifications that do not depart from the spirit and principle of the present invention are equivalent replacements and are included in the protection scope of the present invention.

Claims (9)

1. A response time guarantee type cluster scale adjustment method, characterized by comprising the following steps:
S1, continuously monitoring the load condition through a load detection module, keeping statistical records in circular queues, and calculating, within a preset time, sampled values of five load characteristics: the task arrival rate $R_{ar}$, the coefficient of variation $CoV_{ar}$ of the task arrival intervals, the average response time $T_{rt}$ of completed load tasks, the coefficient of variation $CoV_{st}$ of the task service time, and the average task service rate $R_{sr}$ of the cluster working nodes;
S2, comparing, through a service quality monitoring module, the average response time $T_{rt}$ of the load tasks completed within the preset time with a preset response time standard $E_{rt}$, and evaluating the service quality condition of the cluster system;
S3, continuously analyzing, through a threshold setting adjustment module, the degree of change association between the sampled values $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ of the load characteristics and the service quality of the cluster system, and then setting the initial threshold of each load characteristic in order of decreasing association degree;
S4, adjusting the scale of the cluster system through a scale adjustment module, with the following specific steps:
S41, the scale adjustment module acquires the current service quality condition of the cluster system from the service quality monitoring module;
S42, if the service quality level of the cluster system is within the desired range, the threshold setting adjustment module acquires the sampled value of each characteristic of the current load together with the current cluster scale value and stores them as one sample record; after a set number of samples has accumulated, the load characteristic with the greatest degree of change association with the service quality of the cluster system is taken as the division basis, the optimal division value is determined according to the information gain maximization principle, the initial thresholds are adjusted, and a threshold decision tree is generated;
S43, if the service quality level is not within the desired range, using the threshold decision tree generated in step S42 to compare each sampled characteristic of the current load one by one with the adjusted thresholds, determining the optimal cluster scale for the current load condition, and switching cluster service nodes between the working state and the idle state so that the number of cluster working nodes matches that scale;
S5, receiving the load through a task distribution module and, in a first-come first-served manner, distributing the tasks one by one, each time selecting the cluster working node with the highest task service rate according to the task service rates of the cluster working nodes obtained by the load detection module, the cluster working node processing the received task and, where required, returning a response to the user terminal after processing is completed.
2. The response time guarantee type cluster scale adjustment method according to claim 1, wherein in step S1 the circular queues store records in an overwriting manner; when monitoring $R_{ar}$ and $CoV_{ar}$, the load detection module uses a circular queue A to record the arrival times of arriving tasks in chronological order and, at the preset statistical time, calculates $R_{ar}$ and $CoV_{ar}$ from the records in circular queue A; when monitoring $CoV_{st}$ and $R_{sr}$, it uses a circular queue B to record the service times of completed tasks in chronological order and, at the preset statistical time, calculates $CoV_{st}$ and $R_{sr}$ from the records in circular queue B.
3. The response time guarantee type cluster scale adjustment method according to claim 1, wherein step S2 specifically comprises the following steps:
S21, setting up a circular queue C for each cluster working node through the service quality monitoring module;
S22, whenever a cluster working node completes a task, recording the response time of that task in the circular queue C of the corresponding cluster working node;
S23, after the preset duration, traversing all circular queues C, calculating the average response time $T_{rt}$ of the completed load tasks, comparing $T_{rt}$ with the preset response time standard $E_{rt}$, and evaluating the service quality condition of the cluster system.
4. The response time guarantee type cluster scale adjustment method according to claim 1, wherein in step S3 the degree of change association between $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ and the service quality of the cluster system is judged by the following steps:
S31, the threshold setting adjustment module acquires the sampled values of each load characteristic, namely $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, over the consecutive time periods preceding the latest sample;
S32, using the sampled load characteristics of each time period as statistical data, calculating the service quality relative rate $P_{rq}$ and then the correlation coefficient between $P_{rq}$ and each of $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, where $P_{rq}$ is defined by a formula in which $E_{rt}$ is the preset response time standard;
S33, the magnitudes of the resulting correlation coefficient values correspondingly determine the degree of change association between each load characteristic and the service quality of the cluster system.
5. The response time guarantee type cluster scale adjustment method according to claim 1, wherein in step S43 the service quality level being outside the desired range covers two cases: in one, the average response time $T_{rt}$ is greater than the preset response time standard $E_{rt}$, and the service quality of the cluster system cannot be guaranteed; in the other, the average response time $T_{rt}$ is far less than the preset response time standard $E_{rt}$, and the service quality of the cluster system is better than necessary.
6. A response time guarantee type cluster system comprising a cluster manager and cluster service nodes, characterized in that the cluster manager comprises a load detection module, a service quality monitoring module, a threshold setting adjustment module, a scale adjustment module and a task distribution module; the cluster service nodes comprise cluster working nodes in a working state and cluster idle nodes in an idle state;
the load detection module is used for continuously monitoring the load condition, keeping statistical records in circular queues, and calculating, within a preset time, sampled values of five load characteristics: the task arrival rate $R_{ar}$, the coefficient of variation $CoV_{ar}$ of the task arrival intervals, the average response time $T_{rt}$ of completed load tasks, the coefficient of variation $CoV_{st}$ of the task service time, and the average task service rate $R_{sr}$ of the cluster working nodes;
the service quality monitoring module is used for comparing the average response time $T_{rt}$ of the load tasks completed within the preset time with a preset response time standard $E_{rt}$ and evaluating the service quality condition of the cluster system;
the threshold setting adjustment module is used for continuously analyzing the degree of change association between the sampled values $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ of the load characteristics and the service quality of the cluster system, and then setting the initial threshold of each load characteristic in order of decreasing association degree;
the scale adjustment module is used for acquiring the current service quality condition of the cluster system from the service quality monitoring module; if the service quality level of the cluster system is within the desired range, the threshold setting adjustment module acquires the sampled value of each characteristic of the current load together with the current cluster scale value and stores them as one sample record, and after a set number of samples has accumulated, the load characteristic with the greatest degree of change association with the service quality of the cluster system is taken as the division basis, the optimal division value is determined according to the information gain maximization principle, the initial thresholds are adjusted, and a threshold decision tree is generated; if the service quality level is not within the desired range, the threshold decision tree is used to compare each sampled characteristic of the current load one by one with the adjusted thresholds, the optimal cluster scale for the current load condition is determined, and cluster service nodes are switched between the working state and the idle state so that the number of cluster working nodes matches that scale;
the task distribution module is used for receiving the load and, in a first-come first-served manner, distributing the tasks one by one, each time selecting the cluster working node with the highest task service rate according to the task service rates of the cluster working nodes obtained by the load detection module, the cluster working node processing the received task and, where required, returning a response to the user terminal after processing is completed.
7. The response time guarantee type cluster system according to claim 6, wherein the circular queues store records in an overwriting manner; the load detection module uses a circular queue A to record the arrival times of arriving tasks in chronological order and, at the preset statistical time, calculates $R_{ar}$ and $CoV_{ar}$ from the records in circular queue A; it also uses a circular queue B to record the service times of completed tasks in chronological order and, at the preset statistical time, calculates $CoV_{st}$ and $R_{sr}$ from the records in circular queue B.
8. The response time guarantee type cluster system according to claim 6, wherein the service quality monitoring module evaluates the service quality condition of the cluster system as follows:
setting up a circular queue C for each cluster working node; whenever a cluster working node completes a task, recording the response time of that task in the circular queue C of the corresponding cluster working node; after the preset duration, traversing all circular queue C records, calculating the average response time $T_{rt}$ of the completed load tasks, comparing $T_{rt}$ with the preset response time standard $E_{rt}$, and evaluating the service quality condition of the cluster system.
9. The response time guarantee type cluster system according to claim 6, wherein the threshold setting adjustment module judges the degree of change association between $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$, $R_{sr}$ and the service quality of the cluster system as follows:
acquiring the sampled values of each load characteristic, namely $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, over the consecutive time periods preceding the latest sample;
using the sampled load characteristics of each time period as statistical data, calculating the service quality relative rate $P_{rq}$ and then the correlation coefficient between $P_{rq}$ and each of $R_{ar}$, $CoV_{ar}$, $T_{rt}$, $CoV_{st}$ and $R_{sr}$, where $P_{rq}$ is defined by a formula in which $E_{rt}$ is the preset response time standard;
the magnitudes of the resulting correlation coefficient values correspondingly determine the degree of change association between each load characteristic and the service quality of the cluster system.
CN202310574528.9A 2023-05-22 2023-05-22 Response time guarantee type cluster system and scale adjustment method thereof Pending CN116599858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310574528.9A CN116599858A (en) 2023-05-22 2023-05-22 Response time guarantee type cluster system and scale adjustment method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310574528.9A CN116599858A (en) 2023-05-22 2023-05-22 Response time guarantee type cluster system and scale adjustment method thereof

Publications (1)

Publication Number Publication Date
CN116599858A true CN116599858A (en) 2023-08-15

Family

ID=87595171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310574528.9A Pending CN116599858A (en) 2023-05-22 2023-05-22 Response time guarantee type cluster system and scale adjustment method thereof

Country Status (1)

Country Link
CN (1) CN116599858A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117081965A (en) * 2023-10-19 2023-11-17 山东五棵松电气科技有限公司 Intranet application load on-line monitoring system
CN117081965B (en) * 2023-10-19 2024-01-16 山东五棵松电气科技有限公司 Intranet application load on-line monitoring system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination