CN115509758A - Interference quantification method and system for mixed part load - Google Patents

Publication number
CN115509758A
Authority
CN
China
Prior art keywords
load
interference
entropy
online
mixed
Prior art date
Legal status
Pending
Application number
CN202211222330.6A
Other languages
Chinese (zh)
Inventor
曾绍康
Current Assignee
Tianyi Cloud Technology Co Ltd
Original Assignee
Tianyi Cloud Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Tianyi Cloud Technology Co Ltd
Publication of CN115509758A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G06F9/5044Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an interference quantification method and system for a mixed part load, wherein the mixed part load comprises an online load and an offline load. The interference quantification method comprises: for a characteristic index of contention resources, acquiring first values of the characteristic index at a plurality of time points while the online load and the offline load run together, wherein the contention resources comprise hardware resources that the online load and the offline load need to use when running; and calculating the degree of dispersion among the plurality of first values to obtain an interference entropy value, and quantifying the resource contention degree of the mixed part load through the interference entropy value, wherein the resource contention degree refers to the degree of mutual interference when the online load and the offline load use the contention resources. The quantification accuracy can thereby be improved.

Description

Interference quantification method and system for mixed part load
Technical Field
The invention relates to the field of performance testing, in particular to an interference quantification method and system for mixed part load.
Background
Deploying delay-sensitive online loads and low-priority offline loads together on the same cluster platform of a data center has become an effective way to improve the resource utilization of the data center. While this approach significantly improves resource utilization, it also introduces competition between the online load and the offline load for shared resources on the same platform. Load performance interference refers to a situation in which instantaneous resource contention causes a significant deterioration of the QoS (Quality of Service) of the online load.
Quantitative analysis of load performance interference is a prerequisite for data center system design and optimization. At present, some technologies quantitatively analyze load performance interference based on system-level indexes: they select system-level resource indexes related to load performance to measure the degree of performance interference, and compute the indexes with basic statistical methods such as averaging over a time series. Such analysis cannot adapt to the diverse characteristics of data center loads, cannot reflect the instantaneous nature of resource contention, and therefore has difficulty reflecting the degree of load performance interference accurately.
Disclosure of Invention
In view of the above, embodiments of the present invention provide an interference quantification method, an interference quantification system and a computer-readable storage medium for mixed part load, which can improve quantification accuracy.
The invention provides an interference quantification method for mixed part load, wherein the mixed part load comprises an online load and an offline load, and the method comprises the following steps:
for a characteristic index of contention resources, acquiring first values of the characteristic index at a plurality of time points while the online load and the offline load run together, wherein the contention resources comprise hardware resources that the online load and the offline load need to use when running; and
calculating the degree of dispersion among the plurality of first values to obtain an interference entropy value, and quantifying the resource contention degree of the mixed part load through the interference entropy value, wherein the resource contention degree refers to the degree of mutual interference when the online load and the offline load use the contention resources.
In another aspect, the present invention also provides a computer-readable storage medium for storing a computer program, which when executed by a processor implements the method as described above.
In another aspect, the present invention also provides an interference quantification system comprising a processor and a memory for storing a computer program, which when executed by the processor implements the method as described above.
In some embodiments of the present application, under a condition that the online load and the offline load operate together, first values of the characteristic indicator at a plurality of time points are obtained, and a degree of dispersion between the plurality of first values is calculated, so as to obtain an interference entropy value. The interference entropy value can reflect the instantaneous competition characteristic when the online load and the offline load use competition resources, can reflect the interference degree of the load performance more accurately in the mixed operation process, and has higher quantization precision.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are schematic and are not to be understood as limiting the invention in any way, and in which:
FIG. 1 illustrates an instruction execution model of a hot spot instruction according to an embodiment of the present application;
FIG. 2 illustrates a flow chart of an interference quantification method according to an embodiment of the present application;
FIG. 3 illustrates an SLE-based performance interference affinity heat map provided by an embodiment of the present application;
FIG. 4 shows a schematic diagram of an interference quantification system provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings of the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
Before explaining the scheme of the present application, a description will be given of the related concepts in the present application.
In some embodiments, the online load (also known as an online service) typically takes the form of a service that handles user requests and performs computing tasks, such as a web search service, an online gaming service, or an e-commerce transaction service. Online loads may have higher real-time and stability requirements.
In some embodiments, the offline load (also referred to as an offline analysis job) is typically a computationally intensive batch job, for example a MapReduce or Spark data analysis job or a machine learning model training job. Offline jobs can tolerate higher operational delays and support restarting failed tasks.
In some embodiments, online loads and offline loads may run in the same system, which provides the hardware resources required by both. Offline loads and online loads that are deployed in the same system and share hardware resources are collectively referred to as a mixed part load; that is, a mixed part load includes an online load and an offline load. Hardware resources that may be used while the online load and the offline load run are called contention resources. Mixed part loads may fall into different categories: mixed part loads that include the same online load are classified into the same category, and two mixed part loads that include different online loads are classified into different categories.
In some embodiments, the load performance interference of the mixed part load may be reflected by application-level indexes and system-level indexes. Load performance interference may refer to a case in which the service quality of the online load deteriorates significantly when the online load and the offline load of the mixed part load compete for resources. From the user's perspective, this deterioration mainly shows up as the online load failing to respond to user requests in time. For example, when a user searches through a web page, the web page may not display the search results in time.
In some embodiments, the application-level index represents the delay performance of the online load in the mixed part load. Delay performance characterizes how quickly the online load receives a response after requesting use of a contention resource. It is mainly expressed by the average delay and the tail delay observed while the online load runs, and by values calculated from them. The average delay is the average duration, within a statistical period, from the online load requesting use of a contention resource to receiving the response. The tail delay is the longest such duration within the statistical period.
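The two delay figures defined above can be illustrated with a minimal Python sketch. The function name `delay_metrics` and the sample durations are hypothetical and chosen only for illustration; the tail delay is taken as the maximum, following the definition in the text.

```python
from statistics import mean

def delay_metrics(durations_ms):
    """Return (average delay, tail delay) for one statistical period.

    durations_ms holds the request-to-response durations of the online
    load, in milliseconds. Tail delay follows the text's definition:
    the longest duration observed in the period.
    """
    if not durations_ms:
        raise ValueError("at least one duration is required")
    return mean(durations_ms), max(durations_ms)

# Hypothetical durations for one statistical period.
average_delay, tail_delay = delay_metrics([2.0, 3.0, 4.0, 11.0])
print(average_delay, tail_delay)
```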
In some embodiments, the system level indicator is representative of a characteristic indicator of competing resources. The characteristic index is used for reflecting the functional characteristics of the competitive resource in the operation process. For example, assuming that the memory is a contention resource, the memory utilization rate can be used as a system level indicator.
Before executing the method of the present application, the selection of the online load and the offline load may be performed first according to the service types that may need to be deployed. Based on the selected online load and the offline load, hardware resources (i.e., competitive resources) which are likely to be used when the online load and the offline load operate can be determined, and characteristic indexes of the hardware resources are extracted. Finally, based on the extracted characteristic indexes, the mixed load composed of the online load and the offline load can be subjected to quantitative detection of load performance interference by using the interference quantification method for the mixed load provided by the application.
The selection of online and offline loads is described below.
In some embodiments, the service type set BD and the load delay requirement set BL may be determined first. The service type set BD includes the services that need to be deployed in the same system and share hardware resources. For example, BD = {artificial intelligence, big data, interactive database, HPC}. The load delay requirement set BL includes the delay performance requirements of the online load. For example, BL = {1 to 10ms, >1 to 20ms, >10s}.
In some embodiments, online loads and offline loads may be selected from a Benchmark suite according to the service type set BD and the load delay requirement set BL. For example, an online load whose service type is artificial intelligence and whose delay performance is within 1 to 10 ms may be selected. Specifically, the threshold on the number of online loads may be defined as $Bs_{max}$, and the threshold on the number of offline loads as $Ba_{max}$. When online loads and offline loads are selected from the Benchmark suite, the actual number of selected online loads $n_{on}$ needs to be less than or equal to $Bs_{max}$, and the actual number of selected offline loads $n_{off}$ needs to be less than or equal to $Ba_{max}$. Thus, a load set $B = \{b_i \mid 1 \le i \le n_{on} + n_{off}\}$ can be obtained. The load set B comprises the selected online loads and offline loads. For ease of understanding, Table 1 shows, by way of example, the online loads and offline loads included in one load set B.
TABLE 1 Online and offline loads encompassed by load set B
Figure BDA0003878538610000031
Figure BDA0003878538610000041
In some embodiments, based on the load set B, $n_{on} \times n_{off}$ mixed part loads may be determined. Each mixed part load includes one online load and one offline load. Taking Table 1 as an example, 3 × 5 = 15 mixed part loads in 3 categories can be determined, as follows:
mixed part load of the first kind (online load Imgdnn): { Imgdnn, union }, { Imgdnn, multiply }, { Imgdnn, wordcount }, { Imgdnn, sort }, { Imgdnn, MD5}
Second type mixed part load (online load is Masstree): { Masstree, union }, { Masstree, multiply }, { Masstree, wordcount }, { Masstree, sort }, { Masstree, MD5}
Third type of mixed load (online load is Shore): { Shore, union }, { Shore, multiply }, { Shore, wordcount }, { Shore, sort }, { Shore, MD5}.
Based on the above description, the selection of online and offline loads may be accomplished.
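The pairing described above amounts to a Cartesian product of the online and offline loads, grouped by online load. A minimal Python sketch follows; the load names are those listed above, while the function name `mixed_part_loads` is a hypothetical label used only for illustration.

```python
from itertools import product

online_loads = ["Imgdnn", "Masstree", "Shore"]
offline_loads = ["union", "multiply", "wordcount", "sort", "MD5"]

def mixed_part_loads(online, offline):
    """Pair every online load with every offline load.

    Pairs are grouped by online load, because mixed part loads that
    share an online load belong to the same category.
    """
    categories = {name: [] for name in online}
    for on, off in product(online, offline):
        categories[on].append((on, off))
    return categories

pairs = mixed_part_loads(online_loads, offline_loads)
for online_name, members in pairs.items():
    print(online_name, members)
```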
In some embodiments, after the selection of the online load and the offline load is completed, the instruction set BC corresponding to the load set B may be determined. The process in one embodiment is described below.
First, the hotspot functions of each load (online or offline) in the load set B during operation can be determined separately. Take the online load Imgdnn in Table 1 as an example. Imgdnn is run to completion, the functions called during the run are sorted in descending order of their hardware resource share, and the functions whose hardware resource share exceeds the hotspot function resource threshold $Rf_{max}$ are taken as the hotspot functions of Imgdnn. For example, assume that $Rf_{max}$ = 60% of the total CPU resources, and that the functions called while Imgdnn runs are function A, function B, function C, and function D, which occupy 30%, 80%, 65%, and 70% of the CPU resources when running, respectively. Sorting by occupied CPU resources in descending order gives: function B, function D, function C, function A. The CPU resources occupied by functions B, D, and C exceed $Rf_{max}$, so functions B, C, and D are taken as the hotspot functions of Imgdnn. After the hotspot functions of the other loads in the load set B are determined in the same way, the hotspot functions of all loads in B are merged to obtain the hotspot function set FP.
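The threshold-based selection in this example can be sketched in a few lines of Python. The function name `hotspot_functions` and the dictionary layout are assumptions made for illustration; the shares and the 60% threshold are the ones from the example.

```python
def hotspot_functions(cpu_share, threshold):
    """Return the functions whose resource share exceeds the threshold.

    cpu_share maps a function name to its share of CPU resources (0..1).
    The result is ordered by share, largest first.
    """
    ranked = sorted(cpu_share.items(), key=lambda item: item[1], reverse=True)
    return [name for name, share in ranked if share > threshold]

# Shares from the Imgdnn example; the threshold Rf_max is 60%.
imgdnn_profile = {"A": 0.30, "B": 0.80, "C": 0.65, "D": 0.70}
hot = hotspot_functions(imgdnn_profile, threshold=0.60)
print(hot)

# Merging the per-load results into the hotspot function set FP
# (only one load is profiled in this sketch).
fp = set(hot)
```

The same routine would apply to hot spot instructions by passing per-instruction shares and the instruction threshold instead.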
Further, the hot spot instructions of each function in the hotspot function set FP are determined. Take one function as an example. The function is run to completion, the instructions executed during the run are sorted in descending order of their hardware resource share, and the instructions whose hardware resource share exceeds the hot spot instruction resource threshold $Ra_{max}$ are taken as the hot spot instructions of the function. The process is similar to the hotspot function selection above and is not repeated here. The hot spot instructions of all functions in FP are merged to obtain the hot spot instruction set AP.
Further, the instructions in the hot spot instruction set AP are classified. In particular, the instructions in the hot-spot instruction set AP may be classified according to a triple instruction feature model. The triplet instruction feature model is of the form:
(instruction type, source operand type, destination operand type)
When instructions are classified, instructions belonging to the same instruction type, the same source operand type, and the same destination operand type may be regarded as instructions of the same class. The following description is made for an instruction type, a source operand type, and a destination operand type, respectively.
In some embodiments, the instruction types may be divided into data move instructions, data operation instructions, conditional predicate instructions, jump instructions, and compound instructions composed of the above categories of instructions. Table 2 exemplarily shows the division of the instruction types.
TABLE 2 instruction type partitioning
Figure BDA0003878538610000051
Figure BDA0003878538610000061
In some embodiments, the source operand type and the destination operand type are similar for non-unary operation instructions and may each include an immediate, data in a register, and data in memory. For instructions that are unary operations, the destination operand type may be determined based on the instruction type.
In some embodiments, after instructions in the hot instruction set AP are classified according to the triple instruction feature model, hot instructions may be classified and stored in the instruction set BC.
Based on the above description, an instruction set BC may be obtained. For ease of understanding, an example of the instruction set BC in one embodiment is given below:
BC={irMOV,rrMOV,rmMOV,mrMOV,irOP,rrOP,rmOP,mrOP,rCDT,mCDT,JXX,PUSH,POP,CALL,RET,irCMOV,rrCMOV,rmCMOV,mrCMOV}
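Grouping by the triple (instruction type, source operand type, destination operand type) can be sketched as follows. This is a minimal illustration only: the `Instr` record, the category labels, and the sample instructions are all hypothetical, and the sketch does not attempt to reproduce the naming used in the BC example.

```python
from collections import defaultdict
from typing import NamedTuple

class Instr(NamedTuple):
    text: str        # the instruction as written (hypothetical samples)
    instr_type: str  # e.g. data move, data operation
    src_type: str    # immediate, register, or memory
    dst_type: str    # immediate, register, or memory

def classify(instructions):
    """Group instructions that share the same (type, source, destination)."""
    classes = defaultdict(list)
    for ins in instructions:
        key = (ins.instr_type, ins.src_type, ins.dst_type)
        classes[key].append(ins.text)
    return dict(classes)

sample = [
    Instr("mov $1, %rax", "data_move", "immediate", "register"),
    Instr("mov $7, %rbx", "data_move", "immediate", "register"),
    Instr("mov %rax, (%rcx)", "data_move", "register", "memory"),
    Instr("add %rax, %rbx", "data_operation", "register", "register"),
]
classes = classify(sample)
for key, members in classes.items():
    print(key, members)
```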
In some embodiments, after obtaining the instruction set BC, the contention resource set R may be determined according to the hardware resources (i.e., contention resources) that may be used while the hot spot instructions run. The contention resource set R stores the hardware resources that may be used while the hot spot instructions run. The determination of the contention resource set R is described below.
In some embodiments, the lifecycle of hot spot instruction execution may be abstracted to an instruction execution model. Each hot spot instruction corresponds to an instruction execution model. Referring to fig. 1, a schematic diagram of an instruction execution model of a hot spot instruction according to an embodiment of the present application is shown. As can be seen in FIG. 1, the instruction execution model may include instruction fetch, decode, execute, write back, and update stages for hot instructions. Meanwhile, the instruction execution model can also include hardware resources which may be used by each stage of the hot spot instruction. For example, in FIG. 1, the hardware resources that may be used during the execution phase of a hot spot instruction are registers (registers) and Arithmetic Logic Units (ALUs). Therefore, the hardware resources which are possibly used in the operation process of each hot spot instruction can be determined according to the instruction execution model corresponding to each hot spot instruction. And merging the hardware resources possibly used in the operation process of each hot spot instruction to obtain a competitive resource set R.
Based on the above description, a contention resource set R may be obtained. For ease of understanding, an example of a contention resource set R in one embodiment is given below:
R={L1I TLB,L1D TLB,L2 TLB,L1I Cache,L1D Cache,L2 Cache,L3 Cache,Memory,DISK}
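Merging the per-instruction, per-stage hardware resources into one set is a plain set union. The sketch below assumes a nested dictionary as the representation of the instruction execution models; the stage-to-resource mapping shown is hypothetical and is not taken from FIG. 1.

```python
def contention_resources(execution_models):
    """Union the hardware resources used by any stage of any instruction.

    execution_models maps an instruction class to a dict that maps each
    execution stage to the set of resources the stage may use.
    """
    merged = set()
    for stages in execution_models.values():
        for used in stages.values():
            merged.update(used)
    return merged

# Hypothetical models for two instruction classes from BC.
models = {
    "rrOP": {"fetch": {"L1I Cache", "L1I TLB"}, "write_back": set()},
    "mrMOV": {
        "fetch": {"L1I Cache", "L1I TLB"},
        "execute": {"L1D Cache", "L1D TLB", "Memory"},
    },
}
r = contention_resources(models)
print(sorted(r))
```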
In some embodiments, after obtaining the contention resource set R, characteristic indexes may be extracted for the hardware resources in R. The extraction may be performed according to the functional characteristics of the hardware resources that need attention. For example, the characteristic indexes of the hardware resources may be extracted according to memory behavior, disk IO behavior, and computational behavior. The extracted characteristic indexes may be stored in a characteristic index set T. For ease of understanding, an example of a characteristic index set T is given in Table 3:
TABLE 3 feature index set T
Figure BDA0003878538610000071
In table 3, the feature indicators numbered 1 to 2 may be feature indicators of memory behaviors, the feature indicators numbered 3 to 4 may be feature indicators of IO behaviors of a disk, and the feature indicators numbered 5 to 6 may be feature indicators of computational behaviors.
Based on the above description, the extraction of the feature index is completed.
In some embodiments, defining the service type set BD allows the candidate mixed part loads, contention resources, and characteristic indexes to be drawn from one or more broad service domains, so that the finally extracted characteristic indexes are more comprehensive. Meanwhile, extracting the contention resources through hotspot functions and hot spot instructions effectively filters out functions and instructions of low relevance and reduces the analysis cost.
In some embodiments, based on the extracted characteristic indexes, quantitative detection of load performance interference may be performed on each of the above $n_{on} \times n_{off}$ mixed part loads by the interference quantification method for mixed part load provided by the present application. When the quantitative detection of load performance interference is performed, the following conditions need to be met:
1) Devices (e.g., computers) operating mixed-part loads do not turn on hyper-threading or turbo technology.
2) Data is collected using software tools with negligible performance overhead.
3) The input to the mixed part load is set with the system resource configuration in mind, so that the system is not overloaded.
In some embodiments, the interference quantification method for mixed part load may be applied to an electronic device, wherein the electronic device includes, but is not limited to, a tablet computer, a server, a notebook computer, and a desktop computer. Referring to FIG. 2, a flowchart of an interference quantification method according to an embodiment of the present application is shown. The following steps S21 and S22 may be performed for each mixed part load.
Step S21: for the characteristic indexes of the contention resources, first values of the characteristic indexes at a plurality of time points are acquired while the online load and the offline load run together, wherein the contention resources include the hardware resources that the online load and the offline load need to use when running.
In some embodiments, the online load and the offline load are the online load and the offline load included in the same mixed part load. The contention resources are the hardware resources that may be used while the online load and the offline load run, such as memory, CPU, cache, registers, and hard disk. Based on the above description, it can be understood that, in addition to hardware resources that both the online load and the offline load need to use when running, the contention resources may also include hardware resources that only one of the two loads needs to use, or hardware resources that neither load ends up using. The characteristic indexes of the contention resources are the characteristic indexes in the extracted characteristic index set T.
In some embodiments, the first value of the characteristic measure is data of the characteristic measure collected by a software tool performing data collection when the online load and the offline load are running together. When the first value of the characteristic index is obtained, data acquisition may be performed according to a preset statistical period. Specifically, data collection may be performed every preset time period within the statistical period. For example, assuming a statistical period of 1 minute, data acquisition may be performed every 3 seconds. In this way, a plurality of first values of each characteristic index in the statistical period can be obtained.
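The periodic collection described above can be sketched as a simple sampling loop. In this minimal Python illustration, `collect_first_values` and `read_metric` are hypothetical names; `read_metric` stands in for whatever low-overhead tool actually reads the characteristic index, and the `sleep` parameter exists only so the loop can be exercised without waiting.

```python
import time

def collect_first_values(read_metric, period_s=60.0, interval_s=3.0,
                         sleep=time.sleep):
    """Sample one characteristic index at a fixed interval for one period.

    read_metric is a zero-argument callable returning the current value.
    Returns the list of first values gathered during the period.
    """
    n_samples = int(period_s // interval_s)
    values = []
    for _ in range(n_samples):
        values.append(read_metric())
        sleep(interval_s)
    return values

# Demonstration with a fake metric source and a no-op sleep.
readings = iter(range(100))
values = collect_first_values(lambda: next(readings), sleep=lambda s: None)
print(len(values))  # 20 samples for a 60 s period sampled every 3 s
```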
Step S22: the degree of dispersion among the plurality of first values is calculated to obtain an interference entropy value, and the resource contention degree of the mixed part load is quantified through the interference entropy value, wherein the resource contention degree refers to the degree of mutual interference when the online load and the offline load use the contention resources.
In some embodiments, the degree of dispersion between the plurality of first values may be calculated based on a calculation of entropy. The larger the interference entropy value is, the larger the dispersion degree between the plurality of first values can be represented, the more unstable the value of the characteristic index is, and the mutual interference between the online load and the offline load when the competitive resource is used is larger. Correspondingly, the smaller the interference entropy value is, the smaller the dispersion degree between the first values is, the more stable the value of the characteristic index is, and the mutual interference between the online load and the offline load is smaller when the competitive resource is used.
In some embodiments, the interference entropy value may be calculated by a performance interference quantization model. That is, the first values of the characteristic index obtained in step S21 at a plurality of time points are input into the performance interference quantization model, and the degree of dispersion between the plurality of first values is calculated by the performance interference quantization model, so as to obtain the interference entropy value. The calculation process of the performance interference quantification model is explained below.
In some embodiments, in a case that the contention resource includes a plurality of characteristic indicators, a degree of dispersion between the plurality of first values of each characteristic indicator may be calculated, respectively, to obtain first intermediate entropy values corresponding to the respective characteristic indicators. For example, assume that there are feature index a, feature index b, and feature index c. Obtaining a first intermediate entropy value corresponding to the characteristic index a according to first values of the characteristic index a at a plurality of time points in a statistical period; obtaining a first intermediate entropy value corresponding to the characteristic index b according to first values of the characteristic index b at a plurality of time points in a statistical cycle; similarly, a first intermediate entropy value corresponding to the characteristic indicator c may be obtained.
In some embodiments, for each feature indicator, a first intermediate entropy value for the respective feature indicator may be calculated based on expression (1) below, respectively.
$$H(V_i) = -\sum_{v \in V_i} p(v)\,\log p(v), \qquad p(v) = \frac{\mathrm{num}(v)}{N_v} \tag{1}$$
where $H(V_i)$ represents the first intermediate entropy value of the characteristic index $t_i$ when the mixed part load (namely the online load and the offline load) runs;
$V_i$ represents the set of first values of the characteristic index $t_i$ at a plurality of time points within a statistical period (also called the characteristic index $t_i$), where $1 \le i \le N_t$ and $N_t$ represents the number of characteristic indexes;
$v$ represents one of the first values in $V_i$;
$p(v)$ represents the probability of occurrence of $v$ in $V_i$;
$\mathrm{num}(v)$ represents the number of occurrences of $v$ in $V_i$;
$N_v$ represents the number of first values in $V_i$.
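A minimal Python sketch of the entropy described by these definitions is given below. It assumes the standard Shannon form with a base-2 logarithm (the base is an assumption, as is the function name `intermediate_entropy`), and it sums over the distinct values observed. Samples count as the same value only when they are exactly equal; any rounding is left to the caller.

```python
from collections import Counter
from math import log2

def intermediate_entropy(first_values):
    """Entropy of the first values of one characteristic index.

    p(v) = num(v) / N_v, where num(v) counts occurrences of v and N_v is
    the total number of samples. The sum runs over distinct values.
    """
    n_v = len(first_values)
    if n_v == 0:
        raise ValueError("at least one first value is required")
    counts = Counter(first_values)
    return sum((c / n_v) * log2(n_v / c) for c in counts.values())

# Hypothetical samples: four distinct values versus a constant reading.
print(intermediate_entropy([1, 2, 3, 4]))
print(intermediate_entropy([100, 100, 100]))
```

A constant series yields zero entropy, while a series of all-distinct values yields the maximum for its length.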
In some embodiments, for any characteristic index, the first intermediate entropy value of the characteristic index represents the degree of dispersion among the first values of that index. It can be understood that a larger first intermediate entropy value indicates that the online load and the offline load have a larger influence on the characteristic index when using the contention resources. Taking that characteristic index as the reference, it can then be determined that the degree of mutual interference when the online load and the offline load use the contention resources is larger. For example, assuming that the first intermediate entropy value of characteristic index a is 3 and that of characteristic index b is 5, the first values of characteristic index b are more dispersed and fluctuate more, while the first values of characteristic index a are less dispersed and fluctuate less. Judged by characteristic index a, the degree of mutual interference when the online load and the offline load use the contention resources is smaller; judged by characteristic index b, it is larger.
In some embodiments, the interference entropy value may be obtained from the first intermediate entropy values of the characteristic indexes by the additivity of entropy. Specifically, the first intermediate entropy values of all characteristic indexes are summed to give the interference entropy value, which thus integrates the individual indexes and reflects the overall degree of mutual interference when the online load and the offline load use competing resources. This process can be expressed by expression (2):
$$f(x, T) = \sum_{i=1}^{N_t} H(V_i) \tag{2}$$
where f(x, T) denotes the performance interference quantification model built on the characteristic indexes in the characteristic index set T, x denotes the mixed load currently running, N_t denotes the number of characteristic indexes, and H(V_i) denotes the first intermediate entropy value of characteristic index t_i.
In some embodiments, an interference entropy value obtained by direct summation as above may have low precision, for the following reason:
In general, the larger the first intermediate entropy value of a characteristic index, the greater the mutual interference when the online load and the offline load use competing resources, and the smaller the value, the smaller the interference. In some abnormal situations, however, a small first intermediate entropy value does not mean that the mutual interference is small. For example, the online load and the offline load may cause a competing resource to become stuck, so that its characteristic index stays at a high first value (for example, CPU utilization stays at 100%). The first values then show little dispersion and the first intermediate entropy value is small, yet the mutual interference between the online load and the offline load is in fact large. In view of this, the method of the present application further comprises:
acquiring second values of the characteristic index at a plurality of time points while the online load runs alone, and acquiring third values of the characteristic index at a plurality of time points while the offline load runs alone. The second values are acquired without the offline load interfering with the online load, so their degree of dispersion can be regarded as the minimum dispersion of the characteristic index while the online load runs. Similarly, the third values are acquired without the online load interfering with the offline load, so their degree of dispersion can be regarded as the minimum dispersion of the characteristic index while the offline load runs. An entropy threshold can therefore be determined from the second and third values of the characteristic index; the characteristic indexes whose first intermediate entropy value is larger than the entropy threshold are selected as target characteristic indexes, and the interference entropy value is obtained from the first intermediate entropy values of the target characteristic indexes. This excludes the abnormal scenario of a stuck competing resource and improves detection accuracy.
In some embodiments, to determine the entropy threshold, the degree of dispersion among the second values of each characteristic index is calculated to obtain the second intermediate entropy value of that index, and the degree of dispersion among the third values of each characteristic index is calculated to obtain the third intermediate entropy value of that index; the minimum of the second intermediate entropy value and the third intermediate entropy value is then used as the entropy threshold. The second and third intermediate entropy values are calculated as in expression (1), which is not repeated here. For convenience of description, let the second intermediate entropy value of characteristic index t_i be H(V_on_i) and the third intermediate entropy value be H(V_off_i). The target characteristic indexes may then be selected according to expression (3):
$$H(V_i) \ge \min\bigl(H(V\_on_i),\ H(V\_off_i)\bigr) \tag{3}$$
The interference entropy value can therefore be calculated from the first intermediate entropy values of the selected target characteristic indexes, which improves its accuracy. Expression (3) can serve as a constraint condition of the performance interference quantification model.
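The selection by expression (3) can be sketched as a simple filter. The dictionary layout and the function name `select_target_indexes` are assumptions for illustration; the entropy figures are those quoted for snoop_state_PKI and l2_wb_l3_PKI in the worked example later in this description.

```python
def select_target_indexes(h_mixed, h_online, h_offline):
    """Keep the indexes whose mixed-run entropy satisfies expression (3)."""
    return [
        name
        for name, h in h_mixed.items()
        if h >= min(h_online[name], h_offline[name])
    ]

# Entropy per characteristic index: mixed run, online alone, offline alone.
h_mixed = {"snoop_state_PKI": 3.75, "l2_wb_l3_PKI": 4.62}
h_online = {"snoop_state_PKI": 5.17, "l2_wb_l3_PKI": 5.25}
h_offline = {"snoop_state_PKI": 4.38, "l2_wb_l3_PKI": 2.45}

targets = select_target_indexes(h_mixed, h_online, h_offline)
print(targets)
```

Here snoop_state_PKI is dropped because 3.75 is below min(5.17, 4.38), while l2_wb_l3_PKI is kept because 4.62 is above min(5.25, 2.45).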
Further, for any characteristic index, while the mixed load runs (that is, while the online load and the offline load run simultaneously), the online load and the offline load affect the index differently because their characteristics differ, and this difference carries over into the first intermediate entropy value of the index. In other words, the first intermediate entropy value contains entropy noise. For example, if CPU utilization is 30% when the online load runs alone and 20% when the offline load runs alone, then when both run simultaneously the online load has the greater influence on CPU utilization, and therefore also on the first intermediate entropy value of CPU utilization. In view of this, to eliminate the characteristic difference between the online load and the offline load, the first intermediate entropy value of each target characteristic index may be normalized to obtain the optimized entropy value of that index. The optimized entropy value is the first intermediate entropy value with the entropy noise removed.
Specifically, for any one of the target characteristic indexes, the first intermediate entropy of the target characteristic index is normalized with the sum of the second intermediate entropy and the third intermediate entropy of the target characteristic index as a reference. The first intermediate entropy value of each target feature index may be normalized according to expression (4).
$$E(V_i) = \frac{H(V_i)}{H(V\_on_i) + H(V\_off_i)} \tag{4}$$
where E(V_i) denotes the optimized entropy value of target characteristic index t_i;
H(V_i) denotes the first intermediate entropy value of characteristic index t_i;
H(V_on_i) denotes the second intermediate entropy value of characteristic index t_i when the online load runs alone;
H(V_off_i) denotes the third intermediate entropy value of characteristic index t_i when the offline load runs alone.
As such, the interference entropy value may be calculated based on the optimized entropy value for each target characteristic indicator to improve the accuracy of the interference entropy value.
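The normalization described above divides the first intermediate entropy value by the sum of the second and third intermediate entropy values. A minimal sketch, using the figures quoted for l2_wb_l3_PKI in the worked example later in this description; the function name and the zero-denominator guard are assumptions, not part of the described method.

```python
def optimized_entropy(h_mixed, h_online, h_offline):
    """Normalize the mixed-run entropy by the sum of the stand-alone entropies."""
    reference = h_online + h_offline
    if reference == 0:
        # Assumption: the description does not say how a zero reference is handled.
        raise ValueError("stand-alone entropies sum to zero")
    return h_mixed / reference

e = optimized_entropy(4.62, 5.25, 2.45)
print(round(e, 2))
```

With these inputs the result rounds to 0.60, the value reported for that index.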
Furthermore, the additivity of entropy requires that the characteristic indexes be mutually independent. Since the characteristic indexes of the present application do not satisfy this condition, a weight is introduced for each characteristic index. Specifically, when the interference entropy value is obtained from the optimized entropy values of the target characteristic indexes, each optimized entropy value is multiplied by its weight, the products are summed, and the sum is used as the interference entropy value. This improves the accuracy of the interference entropy value. In some embodiments, for any target characteristic index, the weight of its optimized entropy value may be determined as follows:
while the online load and the offline load run together, the delay performance of the online load is detected; a correlation is then calculated between the first values of the target characteristic index and the delay performance of the online load during operation, and the result is used as the weight of the optimized entropy value of the target characteristic index. The delay performance represents how quickly the online load receives a response after requesting the use of competing resources. Specifically, the first values of the target characteristic index reflect the mutual interference between the online load and the offline load at the system level (that is, the hardware resource level), while the delay performance of the online load reflects it at the application level (that is, the level of the online load's delay performance). Different target characteristic indexes may correlate to different degrees with the delay performance of the online load. In view of this, the correlation between the first values of the target characteristic index and the delay performance of the online load may be calculated based on expression (5):
Figure BDA0003878538610000111
where w_i denotes the degree of correlation between target characteristic index t_i and the delay performance of the online load during operation, that is, the weight of target characteristic index t_i;
AL is the set of average delays of the online load at a plurality of time points within the statistical period (also called the sample of average delays);
V_i denotes the set of first values of characteristic index t_i at a plurality of time points within the statistical period (also called the sample of first values of characteristic index t_i);
μ_AL is the mean of the average delays;
σ_AL is the variance of the average delays;
Figure BDA0003878538610000112
is the corresponding quantity for the first values of characteristic index t_i.
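Expression (5) survives here only as an image reference, but its symbols (a mean and a spread for both AL and V_i) suggest a Pearson-type correlation. The sketch below therefore assumes that the weight is the Pearson correlation coefficient between the average-delay samples and the first values, computed with population standard deviations. The function name, the handling of constant series and the sample data are all assumptions.

```python
import math

def correlation_weight(avg_delays, first_values):
    """Pearson correlation between delay samples and index samples (assumed form)."""
    n = len(avg_delays)
    if n != len(first_values) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mu_al = sum(avg_delays) / n
    mu_v = sum(first_values) / n
    cov = sum((a - mu_al) * (v - mu_v)
              for a, v in zip(avg_delays, first_values)) / n
    sd_al = math.sqrt(sum((a - mu_al) ** 2 for a in avg_delays) / n)
    sd_v = math.sqrt(sum((v - mu_v) ** 2 for v in first_values) / n)
    if sd_al == 0 or sd_v == 0:
        # Assumption: a constant series is treated as uncorrelated.
        return 0.0
    return cov / (sd_al * sd_v)

# Hypothetical samples taken at the same time points.
delays_ms = [1.0, 2.0, 3.0, 4.0]
l3_misses = [2.0, 4.0, 6.0, 8.0]
print(correlation_weight(delays_ms, l3_misses))
```

An index that rises and falls together with the online load's delay receives a weight close to 1, so its optimized entropy value contributes more to the interference entropy value.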
As such, the interference entropy value SLE_o can be calculated based on expression (6), in which the sum runs over the target characteristic indexes:
$$SLE\_o = \sum_{i} w_i \cdot E(V_i) \tag{6}$$
The interference entropy value SLE_o is an index at the hardware resource level (a system-level index) that captures the instantaneous competition between the online load and the offline load when they use competing resources. For example, if both loads place high demand on the CPU, the first values of CPU utilization are highly dispersed, which is reflected in a large interference entropy value SLE_o.
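The weighted summation that yields SLE_o can be sketched as follows. The dictionary layout and the function name `native_sle` are assumptions; of the sample numbers, only the optimized entropy 0.60 for l2_wb_l3_PKI and the weight 0.89 for l3_miss_PKI come from the worked example later in this description, and the remaining figures are hypothetical.

```python
def native_sle(optimized, weights):
    """Sum of weight * optimized entropy over the target characteristic indexes."""
    missing = set(optimized) - set(weights)
    if missing:
        raise KeyError(f"no weight for: {sorted(missing)}")
    return sum(weights[name] * e for name, e in optimized.items())

optimized = {"l2_wb_l3_PKI": 0.60, "l3_miss_PKI": 1.10}   # 1.10 is hypothetical
weights = {"l2_wb_l3_PKI": 0.50, "l3_miss_PKI": 0.89}      # 0.50 is hypothetical

sle_o = native_sle(optimized, weights)
print(round(sle_o, 3))
```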
In some embodiments, the interference entropy value SLE_o can reflect the delay performance of the online load (an application-level index). Here the delay performance of the online load is represented by the tail-to-average ratio, that is, the ratio of the tail delay to the average delay of the online load. Using the tail-to-average ratio removes differences in delay magnitude caused by the differing characteristics of different online loads. In this way, the application-level index of a mixed load can be reflected by its system-level index. Specifically, for the same mixed load running in the same system, a larger calculated SLE_o indicates a larger tail-to-average ratio of the online load, and a smaller SLE_o indicates a smaller one.
In some embodiments of the present application, while the online load and the offline load run together, first values of the characteristic index at a plurality of time points are acquired, and the degree of dispersion among these first values is calculated to obtain the interference entropy value. The interference entropy value captures the instantaneous competition between the online load and the offline load when they use competing resources, reflects more accurately how much load performance is interfered with during mixed operation, and offers higher quantization precision.
Further, in some embodiments, multiple systems exist, and at least two of them have different hardware resources (which may also be described as having different system configurations). For ease of understanding, Table 4 lists, by way of example, the system configuration information of three systems.
Table 4 system configuration information
Figure BDA0003878538610000122
Figure BDA0003878538610000131
As can be seen from Table 4, systems s1 and s2 have the same system configuration information, while the system configuration information of system s3 differs from that of s1 and s2. The characteristic index set T extracted for systems with different configurations, for example different processor architectures, may differ; in particular, the number of characteristic indexes in T may differ. If the same mixed load runs in two systems with different configurations, the interference entropy values SLE_o calculated from the acquired first values are not comparable, that is, SLE_o alone does not support comparison of load performance interference across systems. In view of this, in some embodiments, where the same mixed load runs in each of multiple systems, the method of the present application further comprises:
determining interference entropy values of mixed part loads when the mixed part loads operate in each system;
and performing normalization calculation on the interference entropy values of the mixed part load when the mixed part load operates in each system to obtain second interference entropy values corresponding to the mixed part load in each system, so as to compare the resource competition degree of the mixed part load when the mixed part load operates in each system through the second interference entropy values.
Specifically, when the interference entropy value of the mixed part load when running in each system is normalized, the number of the characteristic indexes in each system may be determined for each of the plurality of systems, and the number of the characteristic indexes in the plurality of systems may be normalized with the largest number of the characteristic indexes as a reference, so as to obtain the first normalization coefficient corresponding to each system. Wherein, the calculation formula of the first normalization coefficient is shown as expression (7):
Figure BDA0003878538610000132
where kn denotes the first normalization coefficient;
NT denotes the set of the numbers of characteristic indexes in the respective systems;
N_t denotes the number of characteristic indexes in the t-th system.
In some embodiments, for any system, the first normalization coefficient of that system is multiplied by the interference entropy value of the mixed load running in that system, which normalizes the interference entropy value. The calculation is shown in expression (8).
SLE1=kn·SLE_o (8)
SLE1 is the value obtained by normalizing the interference entropy value SLE_o with the first normalization coefficient. For the same mixed load, its SLE1 values under different systems reflect how strongly its load performance is interfered with in each system. For example, if the SLE1 value of mixed load A is 5 in system 1 and 4 in system 2, it can be determined that the load performance of mixed load A is interfered with more strongly in system 1.
Further, in some embodiments, multiple mixed loads run in the same system. Because the mixed loads differ in their characteristics, equal interference entropy values SLE_o do not mean that their load performance is interfered with to the same degree. For example, assume two characteristic indexes: CPU utilization and the number of instructions executed per cycle. Assume that mixed load A mainly affects CPU utilization, which fluctuates strongly while A runs, whereas mixed load B mainly affects the number of instructions executed per cycle, which fluctuates strongly while B runs. From the way SLE_o is calculated, the values obtained for mixed load A and mixed load B may be the same. However, for CPU utilization an entropy value of 3 may already indicate strong load performance interference, whereas for the number of instructions executed per cycle an entropy value of 5 may still indicate weak interference. That is, SLE_o alone does not support comparison of load performance interference across loads. In view of this, for the same system, where multiple mixed loads exist, the method of the present application further comprises:
respectively determining the interference entropy value of each mixed load during operation;
carrying out normalization calculation on the interference entropy value of each mixed part load to obtain a first interference entropy value corresponding to each mixed part load;
and comparing the resource competition degrees among the mixed part loads based on the first interference entropy values corresponding to the mixed part loads. The method for carrying out normalization calculation on the interference entropy value of each mixed part load comprises the following steps:
The delay performance of the online load in each mixed load is detected. The delay performance of the online load can be represented by its tail-to-average ratio. In addition, to eliminate the difference in magnitude caused by the characteristics of different online loads, the tail-to-average ratios of the online loads of different mixed loads can be normalized to obtain relative tail-to-average ratios, which then reflect the delay performance of the online load in each mixed load. For any mixed load, the tail-to-average ratio of its online load may be normalized based on expression (8).
$$l' = \frac{l}{l_{on}} \tag{8}$$
where l is the tail-to-average ratio of the online load of the mixed load, that is, the tail-to-average ratio of the online load when the online load and the offline load of the mixed load run simultaneously;
l_on is the tail-to-average ratio of the online load of the mixed load when the online load runs alone;
l' is the relative tail-to-average ratio of the online load of the mixed load.
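A minimal sketch of the two ratios: the tail-to-average ratio of a latency sample, and the relative ratio obtained by dividing the mixed-run ratio by the stand-alone ratio. The description does not state which percentile defines the tail delay; the 99th percentile with nearest-rank selection is an assumption, as are the function names and the latency sample. The ratios 5.53 and 1.64 are the figures quoted for Imgdnn-Wordcount in the worked example later in this description.

```python
import math

def tail_to_average_ratio(latencies, tail_percentile=99):
    """Tail delay (nearest-rank percentile, assumed) divided by mean delay."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies)
    rank = max(1, math.ceil(tail_percentile / 100 * len(ordered)))
    tail = ordered[rank - 1]
    return tail / (sum(ordered) / len(ordered))

def relative_tail_to_average_ratio(l_mixed, l_online_alone):
    """Ratio under mixed operation relative to the online load running alone."""
    return l_mixed / l_online_alone

sample_ms = [1.0, 1.0, 1.0, 1.0, 6.0]   # hypothetical request latencies
print(tail_to_average_ratio(sample_ms))
print(round(relative_tail_to_average_ratio(5.53, 1.64), 2))
```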
Further, for any mixed load, the degree of mapping between its interference entropy value SLE_o and its delay performance may be determined, where the degree of mapping represents the change in delay performance per unit change in the interference entropy value. For any mixed load, the degree of mapping may be determined based on expression (9).
Figure BDA0003878538610000142
where d1 denotes the degree of mapping of the mixed load within the same system.
Further, the interference entropy value of each mixed load may be normalized based on its degree of mapping. Specifically, the mapping degrees of the mixed loads are each normalized with the largest mapping degree as the reference, giving the second normalization coefficient. The second normalization coefficient may be calculated based on expression (10).
Figure BDA0003878538610000143
where kd1 denotes the second normalization coefficient of a mixed load within the same system;
D1 denotes the set of mapping degrees of all mixed loads under the same system.
Further, for any mixed load, its second normalization coefficient may be multiplied by its interference entropy value to normalize the interference entropy value. The normalization may be performed based on expression (11).
SLE2=kd1·SLE_o (11)
SLE2 is the value obtained by normalizing the interference entropy value SLE_o with the second normalization coefficient kd1. For different mixed loads running in the same system, the SLE2 value of each mixed load reflects how strongly its load performance is interfered with in that system. For example, in system 1, if the SLE2 value of mixed load A is 5 and that of mixed load B is 4, it can be determined that the load performance of mixed load A is interfered with more strongly in system 1. This enables comparison of load performance interference across loads.
Further, for different mixed loads under different systems, the first normalization coefficient kn of each system may first be obtained with reference to expression (7); the mapping degree d of each mixed load under the different systems is then calculated through expression (12) by combining the first normalization coefficient kn and the relative tail-to-average ratio l' of each mixed load.
Figure BDA0003878538610000151
Further, based on expression (13), the mapping degrees of the mixed loads under different systems can be normalized with the largest mapping degree as the reference.
$$kd = \frac{\max(D)}{d} \tag{13}$$
where kd denotes the second normalization coefficient of a mixed load across different systems;
D denotes the set of mapping degrees of all mixed loads under all systems.
Further, based on expression (14), a normalization calculation may be performed on the interference entropy value.
SLE=kn·kd·SLE_o (14)
SLE is the value obtained by normalizing the interference entropy value SLE_o with the first normalization coefficient kn and the second normalization coefficient kd. This enables comparison of load performance interference among different mixed loads under different systems, that is, comparison across systems and across loads.
For convenience of understanding, the technical solution of the present application is described below with reference to an embodiment.
In this embodiment, the basic parameters are set as follows: size threshold of load set B, B_max = 8; threshold on the number of online loads in load set B, Bs_max = 3; threshold on the number of offline loads, Ba_max = 5; hotspot function resource threshold Rf_max = 60% of CPU resources; hotspot instruction resource threshold Ra_max = 80% of CPU resources; threshold on the number of data acquisitions in the performance interference calculation, G_max = 5. The embodiment is implemented on an x86 platform. For RISC processors, the binary machine code and the instructions are the same; for CISC processors, the instructions are the microcode of the binary machine code. Since research shows that CISC and RISC have largely converged and their instruction-set differences keep shrinking, the method of the invention is also applicable to other architectures.
The present embodiment can be divided into the following steps:
1. Extracting load set B
1.1) Initialize the load set B = ∅;
1.2) Define the load application domain set BD = {artificial intelligence, big data, interactive database, HPC};
1.3) Define the load delay requirement set BL = {1~10 ms, 1~20 ms, >10 s};
1.4) Based on the typical mixed-load benchmark suite DCMIX, obtain the load set B = {Imgdnn, Masstree, Shore, Union, Multiply, Wordcount, Sort, MD5}. The load set B is shown in Table 5, and the load functions in load set B are shown in Table 6. The load set B can form 3 × 5 = 15 mixed loads.
Table 5 Load set B
| Load name | Load type | Application field | Delay requirement |
| --- | --- | --- | --- |
| Imgdnn | Online load | Artificial intelligence | 1~20 ms |
| Masstree | Online load | Big data | 1~10 ms |
| Shore | Online load | Interactive database | 1~10 ms |
| Union | Offline load | Interactive database | >10 s |
| Multiply | Offline load | HPC | >10 s |
| Wordcount | Offline load | Big data | >10 s |
| Sort | Offline load | Big data | >10 s |
| MD5 | Offline load | HPC | >10 s |
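The 15 mixed loads formed from load set B are the pairings of each online load with each offline load. A short sketch of that enumeration; the "Online-Offline" naming follows names such as Imgdnn-Wordcount used later in this description, and the function name is hypothetical.

```python
from itertools import product

ONLINE = ["Imgdnn", "Masstree", "Shore"]
OFFLINE = ["Union", "Multiply", "Wordcount", "Sort", "MD5"]

def mixed_loads(online, offline):
    """Pair every online load with every offline load."""
    return [f"{on}-{off}" for on, off in product(online, offline)]

pairs = mixed_loads(ONLINE, OFFLINE)
print(len(pairs), pairs[:3])
```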
TABLE 6 load function
Figure BDA0003878538610000161
Figure BDA0003878538610000171
2. Extracting instruction set BC
2.1) Initialize the instruction set BC = ∅;
2.2) For every load in load set B, count hotspots with the Linux performance analysis tool Perf, with the load input configured so that system resources are fully loaded but not overloaded. On average, 3.6 hotspot functions and 9.4 hotspot instructions satisfy the corresponding constraints Rf_max and Ra_max, giving the instruction set BC = {irMOV, rrMOV, rmMOV, mrMOV, irOP, rrOP, rmOP, mrOP, rCDT, mCDT, JXX, PUSH, POP, CALL, RET, irCMOV, rrCMOV, rmCMOV, mrCMOV}. Here MOV denotes a data move instruction, OP a data operation instruction, CDT a condition judgment instruction, and JXX jump and conditional jump instructions. For example, the instructions movl %eax, (%rsp) and movw %dx, (%rax) both belong to the (data move, register, memory) category, abbreviated rmMOV, where r and m are the English initials of register and memory, respectively.
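The operand-kind prefixes in BC (i for immediate, r for register, m for memory) can be derived mechanically from AT&T-syntax operands, as in the two example instructions, which move a register to memory and so classify as rmMOV. The sketch below is an illustrative assumption rather than the tooling used in this embodiment: the function names are hypothetical, the category suffix (MOV, OP and so on) is supplied by the caller, and any operand that is neither `$`-prefixed nor `%`-prefixed is treated as a memory reference.

```python
def split_operands(operand_text):
    """Split on commas that are not inside parentheses, e.g. 8(%rax,%rbx,4)."""
    parts, depth, current = [], 0, ""
    for ch in operand_text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts

def operand_kind(operand):
    if operand.startswith("$"):
        return "i"   # immediate
    if operand.startswith("%"):
        return "r"   # register
    return "m"       # assumption: everything else is a memory reference

def classify(instruction, category="MOV"):
    """Prefix the caller-supplied category with the source/destination kinds."""
    _mnemonic, _, operand_text = instruction.strip().partition(" ")
    kinds = "".join(operand_kind(op) for op in split_operands(operand_text))
    return kinds + category

print(classify("movl %eax, (%rsp)"), classify("movl $1, %eax"))
```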
3. Extracting a set of competing resources R
3.1) From all instructions in the instruction set BC, derive the set of all potential competing resources: R = {L1I TLB, L1D TLB, L2 TLB, L1I Cache, L1D Cache, L2 Cache, L3 Cache, Memory, DISK}.
4. Extracting characteristic index set T
4.1) With MB and DB = {basic behavior indexes, fine-grained behavior indexes, hierarchical behavior indexes reflecting interaction between resources} and CB = {basic behavior index IPC, branch behavior, migration behavior}, a global characteristic index set T is obtained under a typical x86 ISA and a typical Intel Xeon system (Westmere architecture). The details are shown in Table 7, where indexes 1 to 42 are memory behavior indexes, indexes 43 to 45 are disk I/O behavior indexes, and indexes 46 to 49 are computing behavior indexes. In addition, Table 8 shows the result of the general method of selecting characteristic indexes by resource behavior under the Skylake processor architecture. Most characteristic indexes are obtained by extracting the corresponding hardware events with the performance analysis tool Perf; other indexes, such as memory utilization, are computed from performance information extracted from the proc files provided by the Linux system;
TABLE 7 partial feature indicators under Westmere processor architecture
Figure BDA0003878538610000181
Table 8 Partial characteristic indexes under Skylake processor architecture
Figure BDA0003878538610000182
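For indexes that come from proc files rather than hardware events, a value such as memory utilization can be derived from /proc/meminfo. The sketch below is one possible reading, not the computation used in this embodiment: defining utilization as (MemTotal - MemAvailable) / MemTotal is an assumption, the function names are hypothetical, and the sample text is illustrative.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB for most keys)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens:
            fields[key.strip()] = int(tokens[0])
    return fields

def memory_utilization(fields):
    """Assumed definition: share of total memory that is not available."""
    total = fields["MemTotal"]
    return (total - fields["MemAvailable"]) / total

def read_memory_utilization(path="/proc/meminfo"):
    with open(path) as handle:
        return memory_utilization(parse_meminfo(handle.read()))

SAMPLE = (
    "MemTotal:       16000000 kB\n"
    "MemFree:         2000000 kB\n"
    "MemAvailable:    4000000 kB\n"
)
print(memory_utilization(parse_meminfo(SAMPLE)))
```

Calling `read_memory_utilization()` at a fixed interval on a Linux host would yield the time series from which an entropy value for this index can be computed.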
5. Performance interference quantification model construction
5.1) Construct the performance interference quantification model. The model output without normalization is defined as the native system-level entropy index SLE_o, and the normalized output as the system-level entropy index SLE (or SLE1, SLE2). After SLE_o is constructed, accuracy is evaluated step by step, starting from the basic entropy quantification model, to demonstrate the effect of model optimization. Table 9 shows, for all classes of mixed loads, the Pearson correlation coefficient between the model output and the tail-to-average ratio at each optimization stage. As Table 9 shows, compared with the basic quantification model, accuracy improves significantly through the 3 progressive optimization steps and finally reaches its best value. These results show that the optimization method of the invention is effective on top of the basic entropy-based performance interference model;
Table 9 Correlation between model output and tail-to-average ratio during model optimization
Figure BDA0003878538610000183
6. Performance interference calculation
6.1) The load set is B = {Imgdnn, Masstree, Shore, Union, Multiply, Wordcount, Sort, MD5} and the system set is S = {s1, s2, s3}; the specific information of the systems in S is shown in Table 10. The degree of performance interference of every mixed load formed from load set B is calculated under every system in S. The calculation result is defined as a set F, initialized as F = ∅.
Table 10 system configuration information
Figure BDA0003878538610000191
6.2) For the mixed load set X = {x_i | 1 ≤ i ≤ 15} formed from load set B, and for every x_i, the space to be measured is set across mixed-load types and applications. The benchmark inputs are configured according to Table 11, the events are located with Perf according to Table 12, and time-series samples of the characteristic indexes are collected at 10 s intervals. The resulting SLE calculation result F is shown in Table 13. Besides the space to be measured formed by load set B and the system set, Table 13 also includes a baseline for each class of mixed load on each system. The baseline corresponds to co-running with an idle load, which is the ideal state in which performance interference is 0 under the current mixed system.
Table 11 Benchmark configurations
Figure BDA0003878538610000192
Figure BDA0003878538610000201
Table 12 Event codes and masks of the characteristic indexes
| Index number | Westmere | Skylake |
| --- | --- | --- |
| 1 | 0185 | 0185 |
| 2 | 1085 | 1085 |
| …… | …… | …… |
TABLE 13 calculation of System level entropy SLE
Figure BDA0003878538610000202
Figure BDA0003878538610000211
The calculation of one entry in Table 13 is illustrated with a specific example. Let the mixed load being calculated be Imgdnn-Wordcount and the current system be s1. First, the entropy value of each characteristic index in T is calculated for Imgdnn-Wordcount running mixed, Imgdnn running alone, and Wordcount running alone. Indexes that do not satisfy the constraint are then screened out by formula (3), 6 indexes in total. For example, the index snoop_state_PKI is removed because its entropy in mixed mode (3.75) is smaller than both its entropy with the online service running alone (5.17) and its entropy with the offline analysis job running alone (4.38). Next, entropy noise is removed by formula (4). For example, the entropy of index l2_wb_l3_PKI (4.62) is updated to E = 0.60 using its entropy with the online service running alone (5.25) and its entropy with the offline job running alone (2.45). Then the weight w of each characteristic index is calculated by formula (5); l3_miss_PKI has the highest weight, 0.89. The native system-level entropy index SLE_o of Imgdnn-Wordcount, calculated by formula (6), is 9.91. Normalization follows. Since s1 uses the Westmere architecture with 49 characteristic indexes, while the system set S also contains system s3 with the Skylake processor architecture and 42 characteristic indexes, the normalization coefficient kn is calculated as 1 by formula (7).
According to formula (8), the relative tail-to-average ratio TL_n of Imgdnn-Wordcount is calculated as 3.37 from the tail-to-average ratio of Imgdnn-Wordcount (5.53) and the tail-to-average ratio of Imgdnn running alone (1.64), so that according to formula (12) the mapping degree d of Imgdnn-Wordcount is 2.94. In addition, the maximum mapping degree over the space to be measured formed by load set B and system set S is calculated as 10.48, reached by Imgdnn-Sort running on s3, so the normalization coefficient kd of Imgdnn-Wordcount is calculated as 3.56 according to formula (13). Finally, according to formula (14), the system-level entropy index SLE of Imgdnn-Wordcount is 35.32;
7. and (4) ending: the quantification of the mixed portion load performance disturbance is suspended.
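The Imgdnn-Wordcount example above can be retraced numerically. The following Python sketch is illustrative only: formulas (3) to (14) are not reproduced in this section, so the formula shapes used here are assumptions inferred from the quoted values and from the claims (screening against the smaller solo-run entropy, denoising by the sum of the two solo-run entropies, mapping degree taken as SLE_o divided by TL_n). All function and variable names are hypothetical.

```python
# Sketch retracing the Imgdnn-Wordcount numbers on s1 quoted in the text.
# Formula shapes are assumptions inferred from the quoted values and the claims.

def passes_screen(mixed: float, online_alone: float, offline_alone: float) -> bool:
    """Keep an index only if its mixed-run entropy exceeds the smaller solo-run entropy."""
    return mixed > min(online_alone, offline_alone)

def denoise(mixed: float, online_alone: float, offline_alone: float) -> float:
    """Scale the mixed-run entropy by the sum of the two solo-run entropies."""
    return mixed / (online_alone + offline_alone)

# snoop_state_PKI: mixed 3.75, online alone 5.17, offline alone 4.38.
snoop_kept = passes_screen(3.75, 5.17, 4.38)
# l2_wb_l3_PKI: mixed 4.62, online alone 5.25, offline alone 2.45.
l2_wb_entropy = denoise(4.62, 5.25, 2.45)

sle_o = 9.91           # native system-level entropy, quoted from the text
kn = 1.0               # feature-count normalization coefficient for s1, quoted
tl_n = 5.53 / 1.64     # relative tail-to-average ratio: mixed run / online alone
d = sle_o / tl_n       # mapping degree (assumed form that reproduces 2.94)
d_max = 10.48          # largest mapping degree in the space to be measured, quoted
kd = d_max / d         # mapping-degree normalization coefficient
sle = sle_o * kn * kd  # final system-level entropy index

print(snoop_kept, round(l2_wb_entropy, 2))
print(round(tl_n, 2), round(d, 2), round(kd, 2), round(sle, 2))
```

The sketch carries unrounded intermediates, so kd and SLE can differ in the last digit from the figures quoted in the text, which appear to be computed from rounded intermediate values.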
Based on the quantification of mixed load performance interference according to the present invention, the effect of SLE and the performance of the quantification model are evaluated using the SLE calculation results in Table 13. In addition, 2 application cases of SLE are given: a mixed load affinity analysis based on SLE, and an evaluation of different isolation mechanisms using SLE.
(1) Equivalence comparison
With the system-level entropy index SLE, performance interference can be compared one by one under the same measurement standard.
Comparison of s1 and s2: the average SLE values of s1 and s2 are 23.62 and 19.08, respectively, indicating that the overall performance interference level of s2 is lower than that of s1. In addition, the SLE values of 10 of the 15 mixed loads are lower on s2 than on s1, such as Imgdnn-Wordcount, Imgdnn-Sort, Masstree-Wordcount, Masstree-Sort, Shore-Wordcount, and Shore-Sort. The largest difference between s1 and s2 is memory size: the memory capacity of s2 is 3 times that of s1, which affects memory-sensitive loads and loads with heavy memory resource behavior. Taking Imgdnn-Wordcount as an example, its SLE value on s2 is 69.5% lower than on s1. Analyzing the weights of its feature indexes on the two systems shows pronounced differences, which fall into 2 classes. One class is TLB behavior, such as itlb_miss_PKI and dtlb_miss_PKI, for which s1 is 3.6 and 3.3 times s2, respectively. The other is shared cache, i.e., L3 Cache behavior, such as l3_miss_PKI, offcore_rddata_PKI, and snoop_code_PKI, for which s1 is 3.2, 2.8, and 6.1 times s2, respectively. The former class shows that on s2 the TLB is under less pressure in managing page table coherency, while the latter shows that on s2 the overhead of maintaining cache coherency is lower. Conversely, the SLE of the remaining 5 mixed loads is greater on s2 than on s1. These loads are mainly mixed with the offline analysis jobs Union or MD5: Imgdnn-Union, Imgdnn-MD5, Masstree-Union, Shore-Union, and Shore-MD5. It is therefore inferred that a larger memory capacity may entail longer disk IO operation overhead. In addition, the mixed load with the maximum SLE value across all cases is Shore-Union, which is composed of 2 disk IO sensitive loads, and this maximum is reached on s2.
Therefore, the larger memory resource capacity can cause more disk IO resource competition.
Comparison of s1 and s3: the average SLE values of s1 and s3 are 23.62 and 14.62, respectively, indicating that the overall performance interference level of s3 is lower than that of s1. On s3, the SLE values of 13 of the 15 mixed loads are lower than on s1, such as Imgdnn-Union, Imgdnn-Multiply, Masstree-Sort, Shore-Union, and Shore-Wordcount. Although s3 has fewer CPU cores and less storage capacity available, it also experiences weaker CPU contention (e.g., context switching, thread migration) and weaker storage resource contention (e.g., L3 Cache, memory).
(2) Upper limit of system performance
The system-level entropy index SLE provides a general method for evaluating the upper performance limit of a system under different classes of mixed loads. For mixed loads of the same class, the smaller the gap between the SLE of a mixed load and the baseline, the closer the mixing scheme is to the upper performance limit of the system and the higher its mixing efficiency; the larger the gap, the further the scheme deviates from the upper performance limit and the lower its mixing efficiency. For any mixed load, the system mixing efficiency is defined as the baseline divided by the system-level entropy index SLE, multiplied by 100%, as in expression (15):
Figure BDA0003878538610000221
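As a minimal sketch of the definition stated for expression (15), the mixing efficiency can be computed as follows. The function name and the numeric inputs are hypothetical and chosen only to show the arithmetic; they are not values from Table 13.

```python
def mixing_efficiency(baseline_sle: float, sle: float) -> float:
    """Mixing efficiency in percent: baseline divided by SLE, times 100."""
    if sle <= 0:
        raise ValueError("SLE must be positive")
    return baseline_sle / sle * 100.0

# Hypothetical values: an Idle baseline compared with itself, then with a
# mixed load whose SLE is larger than the baseline.
print(mixing_efficiency(12.0, 12.0))
print(mixing_efficiency(12.0, 16.0))
```

The further the SLE of a mixed load rises above its baseline, the lower the resulting percentage, which is the remaining optimization room the text refers to.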
Table 14 gives the mixing efficiency in the space to be measured. The mixing efficiency of the baseline, i.e., of every Idle mixed load, is 100%. The advantage of defining the mixing efficiency relative to the baseline is that it shows intuitively how much room is left for system optimization. For example, Masstree-Wordcount has mixing efficiencies of 72%, 99%, and 73% on the 3 systems, so it is best deployed on s2, where its performance interference level is already very close to the upper limit of the system, i.e., the ideal state without performance interference.
On s1, the average mixing efficiency is 63% for Imgdnn-related loads, 75% for Masstree-related loads, and 39% for Shore-related loads, and 59% overall. On s2, it is 85% for Imgdnn-related loads, 87% for Masstree-related loads, and 41% for Shore-related loads, and 71% overall. On s3, it is 87% for Imgdnn-related loads, 77% for Masstree-related loads, and 61% for Shore-related loads, and 75% overall. Therefore, Imgdnn-related loads are best deployed on s3, Masstree-related loads on s2, and Shore-related loads on s3; taken together, the 3 × 5 mixed loads are best deployed on s3.
Table 14 Mixing efficiency calculation results
Mixed load s1 s2 s3
Imgdnn-Union 92% 86% 93%
Imgdnn-Multiply 81% 82% 96%
Imgdnn-Wordcount 30% 97% 80%
Imgdnn-Sort 27% 82% 70%
Imgdnn-MD5 87% 80% 93%
Imgdnn-Idle 100% 100% 100%
Masstree-Union 93% 78% 75%
Masstree-Multiply 72% 87% 94%
Masstree-Wordcount 72% 99% 73%
Masstree-Sort 60% 82% 71%
Masstree-MD5 76% 88% 72%
Masstree-Idle 100% 100% 100%
Shore-Union 22% 19% 48%
Shore-Multiply 81% 84% 78%
Shore-Wordcount 22% 26% 47%
Shore-Sort 28% 41% 59%
Shore-MD5 40% 35% 76%
Shore-Idle 100% 100% 100%
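The per-family averages and the deployment choice can be recomputed from Table 14. The Python sketch below copies the table entries (Idle baselines omitted) and picks, for each online service, the system with the highest average mixing efficiency. The helper names are hypothetical.

```python
from statistics import mean

# Mixing efficiency (%) from Table 14 as (s1, s2, s3); Idle baselines omitted.
TABLE_14 = {
    "Imgdnn-Union": (92, 86, 93), "Imgdnn-Multiply": (81, 82, 96),
    "Imgdnn-Wordcount": (30, 97, 80), "Imgdnn-Sort": (27, 82, 70),
    "Imgdnn-MD5": (87, 80, 93),
    "Masstree-Union": (93, 78, 75), "Masstree-Multiply": (72, 87, 94),
    "Masstree-Wordcount": (72, 99, 73), "Masstree-Sort": (60, 82, 71),
    "Masstree-MD5": (76, 88, 72),
    "Shore-Union": (22, 19, 48), "Shore-Multiply": (81, 84, 78),
    "Shore-Wordcount": (22, 26, 47), "Shore-Sort": (28, 41, 59),
    "Shore-MD5": (40, 35, 76),
}
SYSTEMS = ("s1", "s2", "s3")

def family_averages(table, family):
    """Average efficiency per system over all mixed loads of one online service."""
    rows = [v for k, v in table.items() if k.startswith(family + "-")]
    return {s: mean(r[i] for r in rows) for i, s in enumerate(SYSTEMS)}

def best_system(averages):
    """System with the highest average mixing efficiency."""
    return max(averages, key=averages.get)

for family in ("Imgdnn", "Masstree", "Shore"):
    avg = family_averages(TABLE_14, family)
    print(family, {s: round(v, 1) for s, v in avg.items()}, best_system(avg))

overall = {s: mean(r[i] for r in TABLE_14.values()) for i, s in enumerate(SYSTEMS)}
print("overall", best_system(overall))
```

Because Table 14 is rounded to whole percentages, averages recomputed this way can differ by about one percentage point from the averages quoted in the text; the resulting deployment choices are the same.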
(3) Accuracy assessment
The accuracy of the performance interference quantification model under different conditions is evaluated with an evaluation index R. R is the Pearson correlation coefficient between the model output, i.e., the native system-level entropy SLE_o, and the application-level tail-to-average ratio l, as shown in formula (16). If R ≥ 0.6, i.e., SLE_o is strongly and positively correlated with the tail-to-average ratio l, the accuracy of the performance interference quantification model is acceptable and the validity of the metric is ensured.
Figure BDA0003878538610000231
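The evaluation index R described above is a standard Pearson correlation coefficient, so it can be sketched in a few lines of Python. The sample SLE_o and tail-to-average values below are hypothetical and serve only to exercise the function; the 0.6 threshold is the one stated in the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)

def accuracy_acceptable(r, threshold=0.6):
    """The text treats R >= 0.6 as a strong positive correlation."""
    return r >= threshold

# Hypothetical SLE_o values and tail-to-average ratios for five mixed loads.
sle_o = [9.9, 4.2, 6.8, 12.5, 3.1]
tail_ratio = [3.4, 1.9, 2.2, 4.0, 1.7]
r = pearson_r(sle_o, tail_ratio)
print(round(r, 3), accuracy_acceptable(r))
```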
Table 15 shows the values of the evaluation index R in the different cases. R is positive and greater than 0.6 in all cases, i.e., a strong positive correlation. The maximum is the accuracy of the Imgdnn-related loads on s3 (0.95), and the minimum is the accuracy of the Imgdnn-related loads on s2 and of the Shore-related loads on s3 (0.90). This indicates that the performance interference quantification model can objectively and accurately quantify performance interference across mixed load classes and systems. Although s3 differs from s1 and s2, the feature index extraction step that provides the model input is a general method suited to different processor architectures, so the accuracy of the model is not affected by system differences.
Table 15 Model accuracy
Mixed load category s1 s2 s3
Imgdnn-related 0.93 0.90 0.95
Masstree-related 0.94 0.94 0.93
Shore-related 0.93 0.91 0.90
(4) Extensibility
4.1 Different online service QPS settings
For each class of mixed load, the QPS of the online service is varied to evaluate the accuracy of the performance interference quantification model. The default QPS is set on the principle of full load without overload, i.e., all CPU cores are allocated without exceeding the QoS threshold. The medium and low QPS levels are set to 50% and 25% of the default QPS, with the specific values shown in Table 16.
Table 16 QPS configuration
Mixed load category Default QPS Medium QPS Low QPS
Imgdnn-related 1100 550 275
Masstree-related 550 275 138
Shore-related 5 3 1
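The medium and low QPS levels in Table 16 can be derived from the default QPS as 50% and 25%. The sketch below assumes round-half-up rounding to whole requests per second; that rounding rule is not stated in the text and is inferred from the table values (for example 5 → 3 and 550 → 138). Function names are hypothetical.

```python
import math

def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed rule)."""
    return math.floor(x + 0.5)

def qps_levels(default_qps: int) -> dict:
    """Default, medium (50%), and low (25%) QPS for one online service."""
    return {
        "default": default_qps,
        "medium": round_half_up(default_qps * 0.5),
        "low": round_half_up(default_qps * 0.25),
    }

for name, default in (("Imgdnn", 1100), ("Masstree", 550), ("Shore", 5)):
    print(name, qps_levels(default))
```

Python's built-in `round` uses round-half-to-even, which would turn 2.5 into 2 rather than the 3 shown for Shore, which is why an explicit half-up helper is used here.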
Table 17 shows the values of the evaluation index R under different QPS settings, where R for the overall QPS is calculated over the pooled data of the default, medium, and low QPS. R is positive and greater than 0.6 in all cases, i.e., a strong positive correlation. The maximum is the accuracy of the Shore-related mixed loads on s2 at the low QPS (0.99), and the minimum is the accuracy of the Masstree-related mixed loads on s2 at the overall QPS (0.65). This indicates that the accuracy of the performance interference quantification model is not affected by the QPS of the online service in a mixed load. The model characterizes the degree of performance interference by the degree of disorder of system-level resource contention. When the QPS changes, the disorder of resource contention changes accordingly and the model captures this change, so it can accurately quantify performance interference under different QPS.
Table 17 Model accuracy under different QPS
Figure BDA0003878538610000241
Figure BDA0003878538610000251
4.2 Ad-hoc offline analysis jobs
To evaluate the accuracy of the performance interference quantification model in the ad-hoc scenario, a new offline analysis job, FFT, is introduced as the ad-hoc load. FFT also comes from the DCMIX benchmark suite. Its application domain is HPC, and it computes the fast Fourier transform of an input matrix file and writes the result to an output file. In this embodiment, a 400 MB .mtx file from the Matrix Market serves as the input file on s1 and s2, and the 10 MB wi2010.mtx serves as the input file on s3.
Table 18 shows the values of the evaluation index R when the ad-hoc offline analysis job is introduced. R is positive and greater than 0.6 in all cases, i.e., a strong positive correlation. The maximum is the accuracy of the Imgdnn-related mixed loads on s3 (0.97), and the minimum is the accuracy of the Shore-related mixed loads on s1 (0.80). This indicates that the accuracy of the performance interference quantification model is not affected by ad-hoc offline analysis jobs. The reason is that, despite differences in computational logic, code implementation, and application-level workload, workloads are uniform at the system level. The invention constructs a unified set of contended resources with an ISA instruction-driven method, extracts representative and complete resource behavior indexes, and covers all potential performance interference factors, so the model can accurately quantify performance interference when ad-hoc offline analysis jobs are introduced.
Table 18 Model accuracy with ad-hoc offline analysis jobs
Mixed load category s1 s2 s3
Imgdnn-related 0.94 0.96 0.97
Masstree-related 0.85 0.81 0.82
Shore-related 0.80 0.90 0.88
(5) Applications of SLE
5.1 SLE based mixed load performance interference affinity analysis
Based on Table 13, the SLE of each mixed load is averaged over all systems and used as the performance interference affinity of that mixing scheme. Figure 3 gives an SLE-based performance interference affinity heat map, in which a higher SLE gives a darker color and indicates poorer performance interference affinity. The figure shows the following. a) Shore-Union has the worst performance interference affinity and is the darkest of all schemes. Since both the online service Shore and the offline analysis job Union are disk IO sensitive loads, mixing loads that are sensitive to the same resource should be avoided. b) From the online service perspective, Shore is the victim that suffers the most performance interference: 4 of its 5 mixing schemes are darker than those of Imgdnn and Masstree. Because storage resources are hierarchical, once strong contention occurs on disk IO, the victim load incurs more overhead to compensate. Disk IO sensitive online services are therefore susceptible to severe performance interference. c) From the offline analysis job perspective, online services mixed with Wordcount and Sort have the worst affinity. In terms of load logic, Sort is a mixed-resource-sensitive load with pronounced behavior on compute, storage, and disk IO resources. Similarly, Wordcount first splits the input file into memory, computes word frequencies per thread, and finally writes the output file. Offline analysis jobs of the mixed-resource-sensitive type are therefore not suitable for mixed deployment.
5.2 SLE based isolation technology effectiveness assessment
2 typical isolation techniques are evaluated: CPU binding and Docker containers. Their SLE values are compared with the SLE values under the default setting without isolation to evaluate how well each technique controls performance interference.
According to the conclusion of the system performance upper limit part of this embodiment, s2 has moderate mixing efficiency and is therefore chosen as the system for evaluating the effectiveness of the isolation techniques. For the CPU binding setup, s2 has 12 CPU cores; CPU 0 to CPU 1 are bound to the online service, and CPU 2 to CPU 11 are bound to the offline analysis job. For the Docker container setup, the offline analysis job is deployed in a separate container on s2.
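The CPU binding split described above (2 cores for the online service, 10 for the offline analysis job on a 12-core system) can be sketched as follows. The text does not say which mechanism performs the binding; this sketch assumes the Linux-only `os.sched_setaffinity` call from the Python standard library, and the helper names and process IDs are hypothetical.

```python
import os

def split_cpus(total_cores: int, online_cores: int):
    """Partition core IDs into an online set and an offline set."""
    if not 0 < online_cores < total_cores:
        raise ValueError("online_cores must be between 1 and total_cores - 1")
    online = set(range(online_cores))
    offline = set(range(online_cores, total_cores))
    return online, offline

def bind(pid: int, cpus: set) -> None:
    """Pin a process to the given CPUs (assumed mechanism, Linux only)."""
    os.sched_setaffinity(pid, cpus)

# s2 in the text: 12 cores, CPU 0-1 for the online service, CPU 2-11 offline.
online_cpus, offline_cpus = split_cpus(12, 2)
print(sorted(online_cpus), sorted(offline_cpus))

# Applying the binding needs real process IDs and a machine with 12 cores:
# bind(online_pid, online_cpus)
# bind(offline_pid, offline_cpus)
```

Keeping the two sets disjoint is what removes direct CPU contention between the online service and the offline job; shared resources such as the L3 Cache and memory remain contended.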
As shown in Table 19, compared with no isolation, CPU binding reduces the SLE value by 8.47% on average across all mixed loads, with a maximum reduction of 22.48% for Masstree-Union; the Docker container technique reduces the SLE value by 7.16% on average across all mixed loads, with a maximum reduction of 20.16% for Masstree-Union. Thus, on s2, CPU binding alleviates the performance interference of mixed loads more effectively and achieves a higher degree of performance interference control.
Table 19 SLE comparisons under different isolation techniques
Figure BDA0003878538610000261
Figure BDA0003878538610000271
Please refer to fig. 4, which is a schematic diagram of an interference quantification system according to an embodiment of the present application. The interference quantification system comprises a processor and a memory for storing a computer program which, when executed by the processor, implements the interference quantification method described above.
The processor may be a Central Processing Unit (CPU). The processor may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of the embodiments of the present invention. The processor executes the non-transitory software programs, instructions and modules stored in the memory, so as to execute various functional applications and data processing of the processor, that is, to implement the method in the above method embodiment.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present application further provides a computer-readable storage medium for storing a computer program, which when executed by a processor, implements the interference quantification method described above.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A method for quantifying interference for mixed loads, wherein the mixed loads comprise online loads and offline loads, and wherein the method comprises:
aiming at a characteristic index of competitive resources, under the condition that the online load and the offline load operate together, acquiring first values of the characteristic index at a plurality of time points, wherein the competitive resources comprise hardware resources which are needed to be used when the online load and the offline load operate; and
calculating the discrete degree among the first values to obtain an interference entropy value, and quantizing the resource competition degree of the mixed load through the interference entropy value, wherein the resource competition degree refers to the mutual interference degree when the online load and the offline load use the competition resources.
2. The method of claim 1, wherein the contention resources comprise a plurality of characteristic indicators;
the calculating a degree of dispersion between a plurality of the first values to obtain an interference entropy value comprises:
respectively calculating the discrete degree among the first values of each characteristic index to obtain a first intermediate entropy value corresponding to each characteristic index;
and obtaining the interference entropy value based on the first intermediate entropy value of the characteristic index.
3. The method of claim 2, wherein the method further comprises:
under the condition that the online load operates alone, second values of the characteristic index at a plurality of time points are obtained, and under the condition that the offline load operates alone, third values of the characteristic index at a plurality of time points are obtained;
the obtaining the interference entropy value based on the first intermediate entropy value of the characteristic indicator includes:
respectively calculating the discrete degree between the plurality of second values of each characteristic index to obtain a second intermediate entropy value corresponding to each characteristic index, and respectively calculating the discrete degree between the plurality of third values of each characteristic index to obtain a third intermediate entropy value corresponding to each characteristic index;
taking the minimum value of the second intermediate entropy value and the third intermediate entropy value as an entropy threshold value;
in the characteristic indexes, determining target characteristic indexes of which the values of first intermediate entropy values are larger than the entropy value threshold value, and respectively carrying out normalization calculation on the first intermediate entropy values of each target characteristic index to obtain optimized entropy values respectively corresponding to the target characteristic indexes;
and according to the weight of the optimized entropy of each target characteristic index, multiplying the optimized entropy of each target characteristic index by the corresponding weight and then adding, and taking the result obtained by adding as the interference entropy.
4. The method of claim 3, wherein for any of the target feature indicators, the weight of the optimized entropy value for the target feature indicator is determined based on:
under the condition that the online load and the offline load operate together, detecting the delay performance of the online load, wherein the delay performance represents the speed of the online load from the request of using the competitive resource to the receiving of the response;
and carrying out correlation calculation on the first value of the target characteristic index and the delay performance in the online load operation process, and taking the calculated value as the weight of the optimized entropy value of the target characteristic index.
5. The method of claim 3, wherein said separately normalizing said first intermediate entropy value for each of said target feature indicators comprises:
and for any target characteristic index, taking the sum of the second intermediate entropy value and the third intermediate entropy value of the target characteristic index as a reference, and performing normalization calculation on the first intermediate entropy value of the target characteristic index.
6. The method of claim 1, wherein the mixed load operates in a system configured to accommodate hardware resources required during operation of the mixed load, and wherein, in a case where the mixed load operates in a plurality of systems respectively, the method further comprises:
determining an interference entropy value of the mixed part load when the mixed part load operates in each system;
and carrying out normalization calculation on the interference entropy values of the mixed part load when the mixed part load runs in each system to obtain second interference entropy values corresponding to the mixed part load in each system, so that the resource competition degree of the mixed part load when the mixed part load runs in each system is compared through the second interference entropy values.
7. The method of claim 6, wherein in the plurality of systems, the number of characteristic measures differs for at least two systems;
the normalization calculation of the interference entropy value of the mixed part load when the mixed part load runs in each system comprises the following steps:
respectively determining the quantity of the characteristic indexes in each system aiming at the systems, and respectively carrying out normalization calculation on the quantity of the characteristic indexes in the systems by taking the maximum quantity of the characteristic indexes as a reference to obtain a first normalization coefficient corresponding to each system;
for any system, the first normalization coefficient corresponding to the system is multiplied by the interference entropy value of the mixed part load when the mixed part load operates in the system, so that the interference entropy value of the mixed part load when the mixed part load operates in the system is normalized and calculated.
8. The method of claim 1, wherein in the presence of a plurality of the mixed loads, the method further comprises:
respectively determining the interference entropy value when each mixed part load operates;
detecting a delay performance of the online load in each of the mixed loads, the delay performance characterizing how fast the online load is from requesting use of the contention resource to receiving a response;
for any mixed part load, determining a mapping degree between an interference entropy value and delay performance of the mixed part load, wherein the mapping degree represents the variation of the delay performance per unit value of variation of the interference entropy value;
respectively carrying out normalization calculation on the mapping degrees of the mixed part loads by taking the mapping degree with the largest value as a reference to obtain a second normalization coefficient;
for any mixed part load, multiplying the second normalization coefficient of the mixed part load by the interference entropy value to perform normalization calculation on the interference entropy value of the mixed part load to obtain a first interference entropy value corresponding to the mixed part load;
and comparing the resource competition degrees among the mixed part loads based on the first interference entropy values corresponding to the mixed part loads.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium is used for storing a computer program which, when executed by a processor, implements the method of any one of claims 1 to 8.
10. An interference quantification system, characterized in that the interference quantification system comprises a processor and a memory for storing a computer program which, when executed by the processor, implements the method according to any one of claims 1 to 8.
CN202211222330.6A 2022-07-29 2022-10-08 Interference quantification method and system for mixed part load Pending CN115509758A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022109097030 2022-07-29
CN202210909703 2022-07-29

Publications (1)

Publication Number Publication Date
CN115509758A true CN115509758A (en) 2022-12-23

Family

ID=84508617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211222330.6A Pending CN115509758A (en) 2022-07-29 2022-10-08 Interference quantification method and system for mixed part load

Country Status (1)

Country Link
CN (1) CN115509758A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089021A (en) * 2023-04-10 2023-05-09 北京大学 Deep learning-oriented large-scale load mixed part scheduling method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination