CN113780336B - Lightweight cache dividing method and device based on machine learning - Google Patents
- Publication number
- CN113780336B CN113780336B CN202110851952.4A CN202110851952A CN113780336B CN 113780336 B CN113780336 B CN 113780336B CN 202110851952 A CN202110851952 A CN 202110851952A CN 113780336 B CN113780336 B CN 113780336B
- Authority
- CN
- China
- Prior art keywords
- cache
- program
- programs
- scheme
- dividing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
Abstract
The invention discloses a lightweight cache dividing method and device based on machine learning. First, a support vector machine model classifies programs according to the degree to which they interfere with other programs when occupying the cache and their sensitivity to the cache dividing size. Second, a Bayesian optimization algorithm performs LLC resource scheduling among the divided program categories, searching for the LLC division scheme that maximizes system throughput; finally, cache division is performed according to the LLC division scheme that yields the highest performance. The technical scheme of the invention reduces scheduling overhead, can rapidly produce a cache dividing scheme, and improves the throughput of the whole system.
Description
Technical Field
The application belongs to the technical field of server cache partitioning, and particularly relates to a lightweight cache partitioning method and device based on machine learning.
Background
The Last Level Cache (LLC) of currently mainstream desktop-level multi-core processors typically employs a sharing mechanism: the applications running on the processor share the LLC without restriction. Traditional cache replacement policies only consider whether data has been accessed and ignore the relationship between the data and a specific application. This means that program data already stored in the LLC may be evicted because of data accesses by other programs. When the evicted program data is accessed again, a cache miss occurs, creating conflicts among concurrent programs at the LLC. Such conflicts necessarily reduce program performance and affect the overall throughput of the system.
The prior art attempts to partition cache resources among concurrent applications to improve LLC performance. The basic idea of these works is to isolate the LLC resources used by concurrently running applications, thereby mitigating contention and mutual interference among concurrent applications at the cache level. There are two ways to implement LLC resource partitioning: one requires complex hardware support, while the other is purely software-based.
Hardware techniques typically rely on complex hardware designs and are not general-purpose. Page coloring, used by software techniques, requires no hardware support but has its own drawbacks. For example, page coloring is not compatible with the huge-page mechanism, because huge pages require a large number of consecutive base pages in virtual and physical memory, causing all available page colors to be occupied. To address this, Intel proposed a coarse-grained cache partitioning technique, Cache Allocation Technology (CAT). The advantage of CAT is the ability to rapidly and dynamically partition the LLC on most Intel commercial CPUs. However, the smallest partition unit of CAT is a cache way, which is a coarse-grained unit. This may result in a program being allocated too many or too few LLC resources, degrading the performance of the program's working set.
There are various prior-art methods for LLC partitioning using CAT. Some partitioning methods rely on detailed performance data collected by performance collectors. For example, an LLC partitioning scheme may be generated based on the cache miss rate of each application in the workload. These methods require monitoring the change in cache miss rate of each application and implementing fine-grained cache partitioning; such real-time monitoring imposes substantial additional analysis overhead on the CPU. Some researchers have applied heuristic algorithms to LLC partitioning and achieved notable partitioning performance. However, heuristic algorithms require continuous "trial and error", i.e., continuous LLC scheduling, to explore a wide search space, possibly resulting in high scheduling overhead. Therefore, a new low-overhead, high-performance cache dividing method needs to be designed.
Disclosure of Invention
The purpose of the application is to provide a lightweight cache dividing method and device based on machine learning, so as to reduce scheduling overhead, rapidly provide a cache dividing scheme and improve the throughput of the whole system.
In order to achieve the above purpose, the technical scheme of the application is as follows:
a lightweight cache dividing method based on machine learning comprises the following steps:
constructing and training a first support vector machine model for distinguishing a strong interference program from a non-strong interference program, and constructing and training a second support vector machine model for distinguishing a cache sensitive program from a cache insensitive program;
classifying programs in the working set by adopting a trained first support vector machine model and a trained second support vector machine model;
the method comprises the steps of taking the number of cache ways occupied by each of a strong interference program, a cache sensitive program and a cache insensitive program as a cache dividing scheme, taking the average weighted acceleration ratio of all application programs in a working set as a black box function of the relation between the cache dividing scheme and system throughput, and predicting the optimal cache dividing scheme by adopting Bayesian optimization.
Further, the black box function f(x) is expressed as:

f(x) = (1/n) · Σ_{i=1..n} (IPC_shared_i / IPC_alone_i)

wherein IPC_shared_i is the number of instructions per clock cycle when the i-th program runs in parallel with other programs, IPC_alone_i is the IPC of the i-th program when running alone, and n is the number of programs in the working set.
Further, predicting the optimal cache dividing scheme by adopting Bayesian optimization includes:
initializing three cache dividing schemes, calculating corresponding black box function values, and starting iteration;
updating the Gaussian process model according to the existing cache dividing scheme and the corresponding black box function value;
adopting the Gaussian process gain expectation as the acquisition function, generating the next cache dividing scheme to be explored as a sample, and calculating the corresponding black box function value;
and when the preset iteration termination condition is reached, terminating the iteration, outputting a final cache division scheme, and otherwise, returning to continue the iteration.
Further, the initializing three cache dividing schemes includes:
when the number of cache ways is M, for the three classes of programs, namely strong interference programs, cache sensitive programs and cache insensitive programs, one class of programs occupies M-2 cache ways while the other two classes each occupy 1 way, thereby generating three cache dividing schemes.
Further, the gain expectation is expressed as:

EI(x) = (m(x) − f(x⁺) − ξ) · CDF(z) + σ(x) · PDF(z), if σ(x) > 0; EI(x) = 0, if σ(x) = 0

wherein m(x) and σ(x) are the estimated mean and estimated mean square error, respectively, of the Gaussian process with respect to x, f(x⁺) is the current optimum value, and ξ is a constant that trades off between sampled and non-sampled regions; furthermore, z = (m(x) − f(x⁺) − ξ) / σ(x), and CDF(z) and PDF(z) are the standard normal cumulative distribution function and probability density function of z, respectively.
The application also provides a lightweight cache dividing device based on machine learning, which comprises a processor and a memory storing a plurality of computer instructions, wherein the computer instructions realize the steps of the lightweight cache dividing method based on machine learning when being executed by the processor.
The lightweight cache dividing method and device based on machine learning first classify programs using support vector machine models, according to the degree to which a program interferes with other programs when occupying the cache and its sensitivity to the cache dividing size. Second, a Bayesian optimization algorithm performs LLC resource scheduling among the divided program categories, searching for the LLC division scheme that maximizes system throughput; finally, cache division is performed according to the LLC division scheme that yields the highest performance. The technical scheme of the invention reduces scheduling overhead, can rapidly produce a cache dividing scheme, and improves the throughput of the whole system.
Drawings
FIG. 1 is a flow chart of a lightweight cache partitioning method based on machine learning;
fig. 2 is a flowchart of a bayesian optimization algorithm according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In recent years, machine learning has been widely applied in fields such as computer vision, natural language processing and automation, because it can realize accurate decisions with low running overhead. Meanwhile, there is a complex internal relationship among program operating parameters (memory bandwidth, LLC capacity, etc.), the cache partitioning scheme, and application performance. Machine learning can successfully build this relational model and accurately infer high-performance partitioning schemes from it. Based on the relational model, the optimal cache partitioning scheme can be inferred by collecting the operating parameters only once as the model input. Thus, applying machine learning to cache partitioning helps reduce analysis overhead.
The lightweight cache dividing method based on machine learning provided by the application, as shown in fig. 1, comprises the following steps:
step S1, a first support vector machine model for distinguishing a strong interference program from a non-strong interference program is constructed and trained, and a second support vector machine model for distinguishing a cache sensitive program from a cache insensitive program is constructed and trained.
In order to rapidly produce an LLC partition scheme and improve the throughput of the whole system, the application provides a lightweight cache partition method based on machine learning. The method first uses support vector machine models to classify application programs, according to the degree to which an application interferes with other applications when occupying the cache and its sensitivity to the cache partition size; second, a Bayesian optimization algorithm performs LLC resource scheduling among the divided program categories, searching for the LLC division scheme that maximizes system throughput; finally, division is performed according to the LLC division scheme that yields the highest performance.
The method constructs support vector machine models, performs offline training with the normalized data, and constructs maximum-geometric-margin hyperplanes for program class division. A Gaussian kernel function is used as the kernel of the support vector machine, so that classes that are not linearly separable can still be separated. The present application applies a binary tree structure to the three-way classification, in which 2 binary support vector machine models are trained, named Classifier-A and Classifier-B respectively. Classifier-A distinguishes strong interference programs from non-strong interference programs, and Classifier-B distinguishes cache sensitive programs from cache insensitive programs.
In this embodiment, n SPEC CPU 2006 and SPEC CPU 2017 applications (n = 4, 5, 6) are randomly selected to form working sets, and 5000 such working sets are collected as the training dataset. Then, 9 operating parameters of each application are sampled using cache monitoring technology (Cache Monitoring Technology, CMT), memory bandwidth monitoring (Memory Bandwidth Monitoring, MBM) and native Linux tools; the specific commands include "pqos -m -p all:pid", "top" and "cat /proc/cpuinfo | grep 'cpu MHz'", etc. The sampled content comprises 9 operating parameters of the application: CPU core frequency, IPC, cache misses, LLC occupancy, local memory bandwidth, memory footprint, virtual memory, resident memory, and the actual number of occupied LLC lines; these operating parameters form the input feature vector of the support vector machine.
In this embodiment, if, when a program runs in parallel with omnetpp, astar and xz, the sum of the average performance degradation of omnetpp, astar and xz exceeds 10%, the program is marked as a strong interference program. If the average performance of a non-strong-interference program running alone rises by more than 10% as its LLC allocation grows from 1 way to all ways, the program is marked as a cache sensitive program; otherwise, it is marked as a cache insensitive program.
In this embodiment, the strong interference programs are first marked according to the performance degradation of the background programs when a program runs in parallel with them; second, cache sensitive and cache insensitive programs are marked based on the average performance improvement of a non-strong-interference program, running alone, as its LLC allocation grows from 1 way to all ways. The dataset is then cleaned: because it contains redundant and interfering information, it must be preprocessed, including standardizing the data format and deleting high-frequency repeated data. For example, if a group of data appears repeatedly, only one copy is retained and the rest are deleted; in addition, samples with missing values are discarded, obvious anomalies are removed, and data whose values are far above or below the normal range are not retained. Finally, the dataset is normalized: the maximum and minimum values of each feature are recorded, and the min-max normalization method converts each feature value into a value between 0 and 1.
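As an illustrative sketch (not code from the patent), the cleaning and min-max normalization steps described above can be rendered in pure Python; the helper names are invented for this example:

```python
def deduplicate(rows):
    """Drop exact duplicate samples, keeping the first occurrence."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def min_max_normalize(rows):
    """Column-wise min-max scaling of feature rows into [0, 1].

    rows: list of equal-length numeric feature vectors. A column with a
    constant value is mapped to 0.0 to avoid division by zero.
    """
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    out = []
    for row in rows:
        out.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return out
```

Each of the 9 sampled operating parameters would form one column of `rows` before training.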
The min-max normalization formula is as follows:

x_normalized = (x_original − min) / (max − min)

wherein min and max are the minimum and maximum values of each feature, x_original is the original feature value, and x_normalized is the feature value after normalization.
Offline training is performed using the normalized data, producing 2 binary support vector machine models named Classifier-A and Classifier-B. Classifier-A distinguishes strong interference programs from non-strong interference programs, and Classifier-B distinguishes cache sensitive programs from cache insensitive programs.
In a specific embodiment, the support vector machine of the present application employs a Gaussian kernel function, whose formula is:

K(x, x') = e^(−‖x − x'‖² / (2σ²))

wherein x is the position of a sample in feature space, x' is the kernel center; σ is the width parameter of the function, controlling its radial range of action; and e is the natural constant.
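A minimal sketch of the Gaussian (RBF) kernel as defined here, using only the standard library; `sigma` corresponds to the width parameter σ:

```python
import math

def gaussian_kernel(x, x_center, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

The kernel equals 1.0 when the sample coincides with the center and decays toward 0 with distance; smaller `sigma` narrows the radial range of action.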
And S2, classifying the programs in the working set by adopting the trained first support vector machine model and the trained second support vector machine model.
For a working set that needs cache dividing, this embodiment adopts the trained support vector machine models to classify the programs in the working set into the following categories: strong interference programs, cache sensitive programs and cache insensitive programs.
Specifically, the first support vector machine model, Classifier-A, first classifies a program as a strong interference program or a non-strong interference program; the second support vector machine model, Classifier-B, then further classifies the non-strong interference programs as cache sensitive programs or cache insensitive programs.
When classifying the programs in the working set, similar to training the support vector machine model, the 9 operation parameters of the programs need to be sampled to form the input feature vector of the support vector machine, and then normalization processing and the like are needed, which are not described herein.
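The two-stage binary-tree classification can be sketched as follows; `classifier_a` and `classifier_b` stand in for the trained Classifier-A and Classifier-B decision functions and are assumptions of this sketch, not the patent's models:

```python
def classify_workload(programs, classifier_a, classifier_b):
    """Two-stage, binary-tree classification of a working set.

    programs: dict mapping program name -> normalized feature vector.
    classifier_a(features) -> True if the program is strongly interfering.
    classifier_b(features) -> True if a non-strong-interference program
    is cache sensitive. Returns class name -> list of program names.
    """
    classes = {
        "strong_interference": [],
        "cache_sensitive": [],
        "cache_insensitive": [],
    }
    for name, features in programs.items():
        if classifier_a(features):          # stage 1: Classifier-A
            classes["strong_interference"].append(name)
        elif classifier_b(features):        # stage 2: Classifier-B
            classes["cache_sensitive"].append(name)
        else:
            classes["cache_insensitive"].append(name)
    return classes
```

In practice both arguments would wrap the decision functions of the two trained RBF-kernel SVMs.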
And S3, taking the number of cache ways occupied by each of the three programs of the strong interference program, the cache sensitive program and the cache insensitive program as a cache dividing scheme, taking the average weighted acceleration ratio of all the application programs in the working set as a black box function of the relation between the cache dividing scheme and the system throughput, and predicting the optimal cache dividing scheme by adopting Bayesian optimization.
The relation between the cache dividing scheme and the system throughput is a black box function f(x), wherein x represents the number of cache ways occupied by each of the three classes of programs, namely strong interference, cache sensitive and cache insensitive programs; for example, x = <1,3,5> means that 1/3/5 cache ways are allocated to the strong interference, cache sensitive and cache insensitive programs, respectively. The black box function, the average weighted speedup WS over all applications in the concurrent workload, is as follows:

f(x) = WS = (1/n) · Σ_{i=1..n} (IPC_shared_i / IPC_alone_i)

wherein IPC_shared_i is the instructions per clock cycle (Instructions Per Clock, IPC) of the i-th program when running in parallel with other programs, IPC_alone_i is the IPC of the i-th program when running alone, and n is the number of applications in the working set; the IPC of each application is sampled by the CMT performance monitor.
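A direct rendering of the weighted-speedup black box metric, assuming the per-program IPC values have already been sampled (the function names are invented for this sketch):

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Average weighted speedup WS = (1/n) * sum_i IPC_shared_i / IPC_alone_i.

    ipc_shared: per-program IPC measured while running concurrently.
    ipc_alone:  per-program IPC measured while running alone.
    """
    assert len(ipc_shared) == len(ipc_alone) and ipc_shared
    n = len(ipc_shared)
    return sum(s / a for s, a in zip(ipc_shared, ipc_alone)) / n
```

A value near 1.0 means the workload runs almost as fast shared as alone; lower values indicate cache contention.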
In the Bayesian optimization initialization, the method initializes three cache dividing schemes, including:
when the number of cache ways is M, for the three classes of programs, namely strong interference programs, cache sensitive programs and cache insensitive programs, one class of programs occupies M-2 cache ways while the other two classes each occupy 1 way, thereby generating three cache dividing schemes.
For example, assuming 20 cache ways, samples in three extreme cases are initialized: one class of programs receives an 18-way cache, and the remaining 2 classes each receive a 1-way cache. The three cache dividing schemes are x = <18,1,1>, x = <1,18,1> and x = <1,1,18>, respectively.
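The three initial extreme schemes can be generated mechanically; `num_ways` (the total number of LLC ways) is a hypothetical parameter name for this sketch:

```python
def initial_schemes(num_ways):
    """Three extreme starting points for M cache ways across the three
    program classes (strong-interference, cache-sensitive,
    cache-insensitive): one class gets M-2 ways, the others 1 way each."""
    big = num_ways - 2
    return [(big, 1, 1), (1, big, 1), (1, 1, big)]
```

With `num_ways=20` this reproduces the <18,1,1>, <1,18,1>, <1,1,18> schemes above.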
The present embodiment adopts bayesian optimization to predict an optimal cache partition scheme, as shown in fig. 2, including:
initializing three cache dividing schemes, calculating corresponding black box function values, and starting iteration;
updating the Gaussian process model according to the existing cache dividing scheme and the corresponding black box function value;
adopting the Gaussian process gain expectation as the acquisition function, generating the next cache dividing scheme to be explored as a sample, and calculating the corresponding black box function value;
and when the preset iteration termination condition is reached, terminating the iteration, outputting a final cache division scheme, and otherwise, returning to continue the iteration.
In the iterative process of Bayesian optimization prediction, a Gaussian Process (GP) model is updated according to the existing set of x and f(x); then the Gaussian process gain expectation (Expected Improvement, EI) is taken as the acquisition function, and the next cache dividing scheme x to be explored is generated for sampling according to the mean and covariance estimated by the GP. The EI calculation formula is as follows:

EI(x) = (m(x) − f(x⁺) − ξ) · CDF(z) + σ(x) · PDF(z), if σ(x) > 0; EI(x) = 0, if σ(x) = 0

wherein m(x) and σ(x) are the estimated mean and estimated mean square error of the GP with respect to x, f(x⁺) is the current optimum value, and ξ is a constant that trades off between sampled and non-sampled regions; furthermore, z = (m(x) − f(x⁺) − ξ) / σ(x), and CDF(z) and PDF(z) are the standard normal cumulative distribution function and probability density function of z, respectively. When EI is calculated, the GP is invoked to provide the model estimates; EI takes non-sampled cache dividing schemes into account and generates the x that is likely to have the highest f(x) value.
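For illustration only (not the patent's implementation), the EI formula can be computed with the standard library; `best_f` corresponds to the current optimum f(x⁺), `xi` to the constant ξ, and the σ(x) = 0 case returns 0:

```python
import math

def _norm_cdf(z):
    # Standard normal CDF via the error function (stdlib only).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def expected_improvement(mean, std, best_f, xi=0.01):
    """EI for maximization: mean, std are the GP posterior m(x), sigma(x)
    at a candidate scheme x; best_f is the best observed f value."""
    if std <= 0.0:
        return 0.0
    z = (mean - best_f - xi) / std
    return (mean - best_f - xi) * _norm_cdf(z) + std * _norm_pdf(z)
```

The first term rewards candidates whose predicted mean already beats the optimum; the second rewards uncertainty, which drives exploration of non-sampled schemes.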
The LLC division scheme generated by the EI formula is executed, the corresponding f(x) is obtained, and the GP is updated. The loop iterates up to 20 times; if during this period the value of EI falls below 0.001, the loop terminates, and the most recently generated LLC division scheme is executed as the final scheme.
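The overall iteration might be skeletonized as follows; `evaluate`, `fit_gp` and `acquisition` are injected stand-ins (the patent does not fix a particular Gaussian-process implementation), so this is a sketch under stated assumptions rather than the patented procedure:

```python
def bayesian_search(evaluate, fit_gp, acquisition, num_ways,
                    max_iter=20, ei_threshold=0.001):
    """Search over way allocations (a, b, c), a+b+c == num_ways, each >= 1.

    evaluate(x)     -> observed black-box value f(x) for scheme x.
    fit_gp(obs)     -> (mean_fn, std_fn) fitted to observed {x: f(x)}.
    acquisition(m, s, best) -> EI-style score for a candidate.
    """
    candidates = [(a, b, num_ways - a - b)
                  for a in range(1, num_ways - 1)
                  for b in range(1, num_ways - a)]
    # Three extreme initial schemes: one class gets M-2 ways.
    big = num_ways - 2
    observed = {x: evaluate(x)
                for x in [(big, 1, 1), (1, big, 1), (1, 1, big)]}
    for _ in range(max_iter):
        mean_fn, std_fn = fit_gp(observed)
        best = max(observed.values())
        scored = [(acquisition(mean_fn(x), std_fn(x), best), x)
                  for x in candidates if x not in observed]
        if not scored:
            break
        ei, x_next = max(scored)
        if ei < ei_threshold:   # convergence test from the description
            break
        observed[x_next] = evaluate(x_next)
    return max(observed, key=observed.get)
```

The returned tuple is the way allocation with the highest observed f(x), i.e. the scheme finally applied via CAT.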
According to the technical scheme, LLC resource scheduling is carried out among divided program categories by applying a Bayesian optimization algorithm, an LLC division scheme capable of maximizing system throughput is searched, and finally cache division is carried out according to the LLC division scheme capable of generating the highest performance.
In another embodiment, the application further provides a lightweight cache division device based on machine learning, which comprises a processor and a memory storing a plurality of computer instructions, wherein the computer instructions realize the steps of the lightweight cache division method based on machine learning when being executed by the processor.
For specific limitations regarding the machine learning-based lightweight cache division apparatus, reference may be made to the above limitations regarding the machine learning-based lightweight cache division method, and no further description is given here. The lightweight cache division device based on machine learning can be fully or partially implemented by software, hardware and a combination thereof. May be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor invokes the corresponding operations.
The memory and the processor are electrically connected to each other, directly or indirectly, for data transmission or interaction. For example, the components may be electrically connected to each other by one or more communication buses or signal lines. The memory stores a computer program that can be executed on the processor; by executing the computer program stored in the memory, the processor implements the machine-learning-based lightweight cache dividing method in the embodiment of the present invention.
The Memory may be, but is not limited to, random access Memory (Random Access Memory, RAM), read Only Memory (ROM), programmable Read Only Memory (Programmable Read-Only Memory, PROM), erasable Read Only Memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable Read Only Memory (Electric Erasable Programmable Read-Only Memory, EEPROM), etc. The memory is used for storing a program, and the processor executes the program after receiving an execution instruction.
The processor may be an integrated circuit chip having data processing capabilities. The processor may be a general-purpose processor including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like. The methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The above examples merely represent a few embodiments of the present application; although they are described specifically and in detail, they should not be construed as limiting the scope of the invention. It should be noted that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the present application, and these fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.
Claims (4)
1. The lightweight cache dividing method based on machine learning is characterized by comprising the following steps of:
constructing and training a first support vector machine model for distinguishing a strong interference program from a non-strong interference program, and constructing and training a second support vector machine model for distinguishing a cache sensitive program from a cache insensitive program;
classifying programs in the working set by adopting a trained first support vector machine model and a trained second support vector machine model;
taking the number of cache ways occupied by each of the three programs, namely a strong interference program, a cache sensitive program and a cache insensitive program, as a cache dividing scheme, taking the average weighted acceleration ratio of all the application programs in a working set as a black box function of the relation between the cache dividing scheme and the system throughput, and adopting Bayesian optimization to predict the optimal cache dividing scheme;
the method for predicting the optimal cache dividing scheme by adopting Bayesian optimization comprises the following steps:
initializing three cache dividing schemes, calculating corresponding black box function values, and starting iteration;
updating the Gaussian process model according to the existing cache dividing scheme and the corresponding black box function value;
adopting the Gaussian process gain expectation as the acquisition function, generating the next cache dividing scheme to be explored as a sample, and calculating the corresponding black box function value;
terminating iteration when a preset iteration termination condition is reached, outputting a final cache division scheme, otherwise returning to continue iteration;
the gain expectation is expressed as:

EI(x) = (m(x) - f(x+) - ξ) · CDF(z) + σ(x) · PDF(z)  if σ(x) > 0, and EI(x) = 0 if σ(x) = 0,

with z = (m(x) - f(x+) - ξ) / σ(x);

wherein m(x) and σ(x) are, respectively, the estimated mean and estimated mean square error of the Gaussian process at x, f(x+) is the current optimum value, ξ is a constant that trades off exploration of unsampled schemes against exploitation of sampled ones, CDF(z) and PDF(z) are the standard normal cumulative distribution function and probability density function of z, respectively, and x denotes a cache partitioning scheme.
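As an illustration (not part of the patent text), the gain-expectation acquisition function can be sketched as follows, assuming the standard expected-improvement formulation; the function names and the default value of `xi` are illustrative choices:

```python
import math

def norm_pdf(z):
    # standard normal probability density function PDF(z)
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # standard normal cumulative distribution function CDF(z)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(m, sigma, f_best, xi=0.01):
    """Gain expectation EI(x) of a candidate cache partitioning scheme x,
    given the Gaussian-process mean m(x), deviation sigma(x), the current
    optimum f(x+), and the exploration/exploitation constant xi."""
    if sigma == 0.0:
        return 0.0  # fully sampled point: no expected gain
    z = (m - f_best - xi) / sigma
    return (m - f_best - xi) * norm_cdf(z) + sigma * norm_pdf(z)
```

In the Bayesian-optimization loop of the claim, this quantity would be evaluated over candidate partitioning schemes, and the scheme maximizing it becomes the next sample to explore.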
2. The machine-learning-based lightweight cache partitioning method as claimed in claim 1, wherein the black-box function f(x) is expressed as:

f(x) = (1/n) · Σ_{i=1}^{n} IPC_shared_i / IPC_alone_i

wherein IPC_shared_i is the number of instructions per clock cycle of the i-th program when it runs in parallel with the other programs, IPC_alone_i is the IPC of the i-th program when it runs alone, n is the number of programs in the working set, and x denotes the numbers of cache ways occupied by the three program classes, namely the strong-interference, cache-sensitive, and cache-insensitive programs.
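A minimal sketch of this black-box function (illustrative, not taken from the patent text):

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Average weighted speedup f(x) = (1/n) * sum_i IPC_shared_i / IPC_alone_i
    over the n programs of the working set under a partitioning scheme x."""
    n = len(ipc_shared)
    return sum(s / a for s, a in zip(ipc_shared, ipc_alone)) / n
```

Each ratio measures how close a program runs to its solo performance; averaging over the working set yields the throughput metric that Bayesian optimization maximizes.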
3. The machine-learning-based lightweight cache partitioning method as claimed in claim 1, wherein initializing the three cache partitioning schemes comprises:
when the total number of cache ways is M, for the three program classes, namely strong-interference, cache-sensitive, and cache-insensitive programs, allocating M-2 ways to one class and 1 way to each of the other two classes, in turn for each of the three classes, thereby generating three cache partitioning schemes.
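A sketch of this initialization, assuming a scheme is represented as a tuple of way counts for the (strong-interference, cache-sensitive, cache-insensitive) classes; the function name is illustrative:

```python
def initial_schemes(M):
    """Three seed cache partitioning schemes for an M-way cache: each class
    in turn receives M - 2 ways while the other two receive 1 way each."""
    schemes = []
    for favored in range(3):
        ways = [1, 1, 1]
        ways[favored] = M - 2  # favored class takes the bulk of the cache
        schemes.append(tuple(ways))
    return schemes
```

These three extreme points give the Gaussian-process model initial observations spread across the search space before the iteration begins.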
4. A lightweight machine-learning-based cache partitioning device, comprising a processor and a memory storing a number of computer instructions, wherein the computer instructions, when executed by the processor, implement the steps of the method of any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110851952.4A CN113780336B (en) | 2021-07-27 | 2021-07-27 | Lightweight cache dividing method and device based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113780336A CN113780336A (en) | 2021-12-10 |
CN113780336B true CN113780336B (en) | 2024-02-02 |
Family
ID=78836399
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110851952.4A Active CN113780336B (en) | 2021-07-27 | 2021-07-27 | Lightweight cache dividing method and device based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113780336B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105519030A (en) * | 2014-04-30 | 2016-04-20 | 华为技术有限公司 | Computer program product and apparatus for fast link adaptation in a communication system |
CN106708626A (en) * | 2016-12-20 | 2017-05-24 | 北京工业大学 | Low power consumption-oriented heterogeneous multi-core shared cache partitioning method |
CN112000465A (en) * | 2020-07-21 | 2020-11-27 | 山东师范大学 | Method and system for reducing performance interference of delay sensitive program in data center environment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11062232B2 (en) * | 2018-08-01 | 2021-07-13 | International Business Machines Corporation | Determining sectors of a track to stage into cache using a machine learning module |
Also Published As
Publication number | Publication date |
---|---|
CN113780336A (en) | 2021-12-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11176487B2 (en) | Gradient-based auto-tuning for machine learning and deep learning models | |
JP7366274B2 (en) | Adaptive search method and device for neural networks | |
Zhu et al. | Semi-supervised streaming learning with emerging new labels | |
Dublish et al. | Poise: Balancing thread-level parallelism and memory system performance in GPUs using machine learning | |
Genender-Feltheimer | Visualizing high dimensional and big data | |
CN117077871B (en) | Method and device for constructing energy demand prediction model based on big data | |
WO2021040584A1 (en) | Entity and method performed therein for handling computational resources | |
CN115495095A (en) | Whole program compiling method, device, equipment, medium and cluster of tensor program | |
CN113780336B (en) | Lightweight cache dividing method and device based on machine learning | |
CN111930484B (en) | Power grid information communication server thread pool performance optimization method and system | |
Sun et al. | Particle swarm algorithm: convergence and applications | |
CN111860818B (en) | SOM neural network algorithm processing method based on intelligent chip | |
Sirotković et al. | Accelerating mean shift image segmentation with IFGT on massively parallel GPU | |
Qiu et al. | Machine-learning-based cache partition method in cloud environment | |
CN113434286A (en) | Energy efficiency optimization method suitable for mobile application processor | |
CN115812199A (en) | Hyperspace-based processing of datasets for Electronic Design Automation (EDA) applications | |
CN114356418B (en) | Intelligent table entry controller and control method | |
WO2023030227A1 (en) | Data processing method, apparatus and system | |
EP4242837A1 (en) | Data processing apparatus and method | |
Huang et al. | Learning to Drive Software-Defined Storage | |
Zhou et al. | IOMeans: Classifying Multi-concurrent I/O Threads Using Spatio-Tempo Mapping | |
Nguyen et al. | High resolution self-organizing maps | |
Yayah et al. | Parallel classification and optimization of telco trouble ticket dataset | |
Mustapha et al. | Research Article Evaluation of Parallel Self-organizing Map Using Heterogeneous System Platform | |
CN118194072A (en) | Ultra-large-scale data-oriented pellet clustering method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||