CN113360192A - Thermal cache identification method and device, storage medium and electronic equipment - Google Patents
- Publication number
- CN113360192A (application CN202010149714.4A)
- Authority
- CN
- China
- Prior art keywords
- cache
- task
- byte number
- residual threshold
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/4401—Bootstrapping
- G06F9/4418—Suspend and resume; Hibernate and awake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44568—Immediately runnable code
- G06F9/44578—Preparing or optimising for loading
Abstract
The embodiments of the present application disclose a hot cache identification method and apparatus, a storage medium, and an electronic device. The method comprises: when a task is awakened, acquiring the number of times the running data corresponding to the task was lost from the cache during the task's sleep; obtaining a residual threshold for the running data in the cache; and, if the loss count is less than the residual threshold, determining that the cache is a hot cache. By counting the losses of the running data from the cache during the task's sleep, the loss count is objective and accurate, and the accuracy of hot cache identification is improved.
Description
Technical Field
The present application relates to the field of computer technology, and in particular to a hot cache identification method and apparatus, a storage medium, and an electronic device.
Background
When a task is awakened after sleeping, if the sleep duration is below a certain time threshold, the cache of the processor on which the task ran before sleeping is assumed to be hot, and the task is preferentially scheduled back onto that processor to continue running. If the task were instead scheduled onto another processor in the system, its running data would have to be reloaded into that processor's cache from memory (such as dynamic memory or virtual memory). Running on the original processor therefore yields better execution time and power consumption.
Such optimization rests on the assumption that, if the sleep time is below the time threshold, the task's running data most likely still resides in the cache rather than having been evicted. In practice, the time threshold is difficult to set to a suitable value, so identifying whether the task is cache-hot on its original processor is not sufficiently accurate.
Disclosure of Invention
The embodiments of the present application provide a hot cache identification method and apparatus, a storage medium, and an electronic device. The hot cache is identified by counting the number of times the task's running data is lost from the cache during the task's sleep; the loss count is objective and accurate, improving the accuracy of identifying a hot cache on the processor on which the task ran before sleeping.
The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a method for identifying a hot cache, where the method includes:
when a task is awakened, acquiring the loss times of running data corresponding to the task in a cache during the sleep period of the task;
obtaining a residual threshold value of the running data in the cache;
and if the loss times are less than the residual threshold value, determining the cache as hot cache.
In a second aspect, an embodiment of the present application provides an apparatus for identifying a thermal cache, where the apparatus includes:
the loss times acquisition module is used for acquiring the loss times of the running data corresponding to the task in the cache during the sleep period of the task when the task is awakened;
a residual threshold obtaining module, configured to obtain a residual threshold of the operating data in the cache;
and a hot cache determining module, configured to determine the cache as a hot cache if the loss count is less than the residual threshold.
In a third aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
in the embodiments of the present application, when a task is awakened, the number of times the running data corresponding to the task was lost from the cache during the task's sleep is acquired, along with a residual threshold for the running data in the cache; if the loss count is less than the residual threshold, the cache is determined to be hot. Because the hot cache is judged by counting losses of the running data during the task's sleep, the loss count is objective and accurate, and the accuracy of identifying that the task is cache-hot on its pre-sleep processor is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for identifying a hot cache according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating an example of task sleep time provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram of a cache and a processor according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart of a method for identifying a hot cache according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a cache and a processor according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a multi-CPU system according to an embodiment of the present application;
FIG. 7 is a diagram illustrating a system memory according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a thermal cache identification apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a thermal cache identification apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
In the description of the present application, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meaning of these terms can be understood in context by those of ordinary skill in the art. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes an association between objects and covers three cases: for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
The following describes in detail the hot cache identification method provided by the embodiments of the present application with reference to figs. 1 to 7. The method may be implemented by a computer program running on a von Neumann-based hot cache identification device. The computer program may be integrated into an application or may run as a separate tool. The hot cache identification device in the embodiments of the present application may be a user terminal, including but not limited to: a smartphone, personal computer, tablet, handheld device, wearable device, computing device, or other processing device connected to a wireless modem.
Fig. 1 is a schematic flow chart of a method for identifying a hot cache according to an embodiment of the present disclosure. As shown in fig. 1, the method of the embodiment of the present application may include the steps of:
s101, when a task is awakened, obtaining the loss times of running data corresponding to the task in a cache during the sleep period of the task;
it is to be understood that a task may be viewed as a thread. At least one thread may be involved in a process in general. The thread can utilize the resources owned by the process, and in an operating system introducing the thread, the process is generally used as a basic unit for allocating the resources, and the thread is used as a basic unit for independently running and independently scheduling. Because the thread is smaller than the process and basically does not have system resources, the overhead for scheduling the thread is smaller, and the concurrent execution degree among a plurality of programs of the system can be more efficiently improved.
Each task is in one of five states: sleeping, ready, running, suspended (waiting for some event to occur), or interrupted; the running state of a task matches that of the process to which it belongs. In the embodiments of the present application, the task transitions from the sleep state into the ready state, that is, it is a task to be awakened.
A task enters the sleep state when, during execution, it is interrupted or preempted and performs a sleep operation.
Specifically, when the task is awakened again, the number of losses or hits of the task's running data and information in the cache during the continuous sleep period before the wake-up can be counted by the Performance Monitoring Unit (PMU) integrated in the system's ARM chip. The PMU can count such data at the most basic processor-instruction level.
The same task may sleep multiple times between its start and its end. As shown in fig. 2, the task starts at T0 and runs until T1, sleeps for the first time from T1 to T2, and sleeps for the second time from T3 to T4. If the current wake-up time is T4, the counted cache loss count is the number of losses from T3 to T4.
A loss (a cache miss) means that a read instruction cannot find the content to be read in the cache; on a miss, a line already in the cache must be discarded. Each time a line is read and not found, the loss count is incremented by one.
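This counting rule can be illustrated with a toy simulation (this models the bookkeeping only, not the actual PMU hardware mechanism; the class and attribute names are illustrative): the cache is a set of resident line tags, and every miss discards one resident line and increments the loss counter.

```python
class TinyCache:
    """Toy model of miss ('loss') counting: not the real PMU, just the rule
    'if a read does not hit, discard a resident line and count a loss'."""

    def __init__(self, lines):
        self.lines = set(lines)   # tags of lines currently resident
        self.loss_count = 0

    def read(self, tag):
        if tag in self.lines:
            return True           # hit: nothing is evicted
        self.loss_count += 1      # miss: loss count + 1
        if self.lines:
            self.lines.pop()      # a line already in the cache is discarded
        self.lines.add(tag)       # the newly read line becomes resident
        return False
```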
Optionally, the cache includes an exclusive cache and/or a shared cache. The exclusive cache is exclusive to each CPU in the CPUs contained in the system, and the shared cache is shared by the CPUs contained in the system. When the running data is only located in an exclusive cache, acquiring the number of times of losing the running data corresponding to the task in the exclusive cache during the task sleep period; or, when the running data is only located in a shared cache, acquiring the number of times of loss of the running data corresponding to the task in the shared cache during the sleep period of the task; or, when the running data is located in an exclusive cache and a shared cache, acquiring a first loss frequency of the running data corresponding to the task in the exclusive cache and a second loss frequency of the running data corresponding to the task in the shared cache during the sleep period of the task, and determining the sum of the first loss frequency and the second loss frequency as the loss frequency.
It should be noted that the cache is a temporary data exchange area between the processor (CPU) and the memory, used to bridge the gap between the CPU's processing speed and the memory's read-write speed; the cache is faster than the memory. The cache is either integrated directly into the CPU chip or placed on a separate chip interconnected via the motherboard bus; at this stage it is generally integrated directly into the CPU, as shown in fig. 3. The CPU usually needs to process the same data and execute the same instructions repeatedly; if these can be found in the cache, the CPU need not read them from memory or the hard disk, which reduces the response time of the whole machine.
The CPU is an operation core and a control core of a computer and is a final execution unit for information processing and program operation. The CPU includes an arithmetic logic unit, a register unit, a control unit, and the like, and has functions of processing instructions, performing operations, controlling time, processing data, and the like. The performance of a CPU is mainly reflected in the speed at which it runs programs. The performance indexes affecting the operating speed include parameters such as the operating frequency of the CPU, the cache capacity, the instruction system and the logic structure.
S102, obtaining a residual threshold value of the running data in the cache;
and the residual threshold is a critical value which is updated and replaced after the running data of the target task in the cache is lost.
Assuming the cache size is M bytes and each cache line is L bytes, the cache contains M/L lines; after M/L losses, the cache contents are considered fully replaced. If retaining 50% of the content is taken as the threshold point, the residual threshold is 50% × M/L.
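Under these assumed sizes, the threshold computation can be sketched as follows (the function name and default ratio are illustrative, not part of the patent):

```python
def residual_threshold(cache_bytes: int, line_bytes: int,
                       keep_ratio: float = 0.5) -> float:
    """Residual threshold = keep_ratio * (M / L).

    After M/L losses the cache contents are considered fully replaced;
    keep_ratio (e.g. 50%) marks the 'still mostly resident' point.
    """
    total_lines = cache_bytes // line_bytes  # M/L lines in the cache
    return keep_ratio * total_lines
```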
Optionally, when the operation data is only located in an exclusive cache, obtaining a total number of bytes of the operation data in the exclusive cache and a number of row bytes of each row in the exclusive cache, calculating a quotient of the total number of bytes and the number of row bytes, and determining a product of the quotient and a preset ratio as a residual threshold of the operation data in the cache. When the operation data is only located in a shared cache, acquiring the total byte number of the operation data in the shared cache and the row byte number of each row in the shared cache, calculating the quotient of the total byte number and the row byte number, and determining the product of the quotient and a preset proportion as the residual threshold value of the operation data in the cache. When the operation data are located in an exclusive cache and a shared cache, acquiring a first total byte number of the operation data in the exclusive cache and a first row byte number of each row in the exclusive cache, calculating a first quotient value of the first total byte number and the first row byte number, and determining a product of the first quotient value and a first preset proportion as a first residual threshold value of the operation data in the exclusive cache; acquiring a second total byte number of the operating data in the shared cache and a second row byte number of each row in the shared cache, calculating a second quotient value of the second total byte number and the second row byte number, and determining a product of the second quotient value and a second preset proportion as a second residual threshold value of the operating data in the shared cache; and determining the sum of the product of the first residual threshold and the first weight and the product of the second residual threshold and the second weight as the residual threshold of the running data in the cache.
S103, if the lost times are smaller than the residual threshold value, determining that the task is a hot cache on a processor, wherein the processor is the processor on which the task runs before sleeping.
If the number of cache losses is below the residual threshold, most of the task's running data is still retained in the cache; the data need not be read from memory into the cache again, and the task can run directly on the cached data, so it can be determined that the task is a hot cache on the processor on which it ran before sleeping.
In the embodiments of the present application, when a task is awakened, the number of times the running data corresponding to the task was lost from the cache during the task's sleep is acquired, along with a residual threshold for the running data in the cache; if the loss count is less than the residual threshold, the cache is determined to be hot. Because the hot cache is judged by counting losses of the running data during the task's sleep, the loss count is objective and accurate, and the accuracy of identifying that the task's running data is hot on its pre-sleep processor is improved.
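The flow of S101-S103 can be condensed into a minimal sketch (the function names and return strings are illustrative, not from the patent):

```python
def is_hot_cache(loss_count: int, residual_threshold: float) -> bool:
    """S103: the cache is hot only if fewer lines were lost during the
    task's sleep than the residual threshold allows."""
    return loss_count < residual_threshold

def on_task_wakeup(loss_count: int, residual_threshold: float) -> str:
    # Hot cache: schedule the task back onto its pre-sleep CPU;
    # otherwise its running data must first be reloaded from memory.
    if is_hot_cache(loss_count, residual_threshold):
        return "run on pre-sleep CPU (hot cache)"
    return "reload from memory, then run"
```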
Fig. 4 is a schematic flow chart of a method for identifying a hot cache according to an embodiment of the present disclosure. The embodiment shown in fig. 4 differs from that of fig. 1 in two respects. First, it describes hot cache identification for the case where the cache corresponding to the task includes both an exclusive cache and a shared cache (that is, part of the task's running data is stored in the exclusive cache and part in the shared cache). Second, when the task is confirmed to be a hot cache, it is run on the CPU on which it ran before sleeping; when it is confirmed not to be, the running data is first read from memory and the task is then run. The hot cache identification method may comprise the following steps:
s201, when a task is awakened, acquiring a first loss frequency of running data corresponding to the task in the exclusive cache and a second loss frequency in the shared cache during the sleep period of the task, and determining the sum of the first loss frequency and the second loss frequency as a loss frequency;
the CPU includes three levels of caches L1, L2, and L3, where L1 is a first level cache, L2 is a second level cache, and L3 is a third level cache, which are all caches integrated in the CPU, and they all function as a high-speed data buffer between the CPU and the main memory, as shown in fig. 5. L1 is closest to the CPU core; l2 second; l3 again. In the aspect of operation speed: l1 fastest, L2 times faster, L3 slowest; capacity size: l1 smallest, L2 larger, L3 largest.
L1 is an exclusive cache, unique to each CPU: if the system has multiple CPUs, there are multiple L1 caches. L2 and L3 are shared caches. For example, if an 8-core system is divided into two groups of 4 CPUs, one optimized for performance and one for power consumption, then each group corresponds to one L2: one L2 is shared by the 4 power-optimized CPUs and the other by the 4 performance-optimized CPUs. L3 is shared by all CPUs in the system; in the same 8-core example, all 8 CPUs share one L3.
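The sharing arrangement described above can be modeled as follows (the cluster size, function name, and cache labels are illustrative):

```python
def caches_visible_to(cpu: int, cpus_per_cluster: int = 4) -> dict:
    """Model of the 8-core example: one L1 per CPU (exclusive),
    one L2 per 4-CPU cluster, one L3 shared by all CPUs."""
    cluster = cpu // cpus_per_cluster
    return {
        "L1": f"L1-cpu{cpu}",          # exclusive to this CPU
        "L2": f"L2-cluster{cluster}",  # shared within the CPU's cluster
        "L3": "L3-shared",             # shared system-wide
    }
```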
When part of the task's running data is cached in the exclusive cache L1 and another part in the shared cache L2, the CPU searches the fastest cache, L1, for the data first, and then searches the slower L2 or L3 for the running data.
Specifically, when a task is woken up again, the PMU integrated in the system's ARM chip may count a first loss count n1 of the task's running data and information in the exclusive cache L1 and a second loss count n2 in the shared cache L2 and/or L3 during the continuous sleep period before the wake-up, so that the total loss count n = n1 + n2. A loss means that a read instruction fails to find the corresponding content in the cache, in which case a line already in the cache must be discarded; each time a line is read and not found, the loss count is incremented by one. The PMU can count such data at the most basic processor-instruction level.
S202, acquiring a first total byte number of the operating data in the exclusive cache and a first row byte number of each row in the exclusive cache;
if a first total byte number M1 in the running data is cached in the L1, the byte number of the first row included in each row in the L1 is N1.
It should be noted that reading the run data from the buffer is performed row by row.
S203, calculating a first quotient value of the first total byte number and the first row byte number, and determining a product of the first quotient value and a first preset proportion as a first residual threshold value of the operating data in the exclusive cache;
the first predetermined ratio is X1 and may be predetermined, such as 50%.
The first residual threshold C1 = (M1/N1) × X1.
S204, acquiring a second total byte number of the operating data in the shared cache and a second row byte number of each row in the shared cache;
if the first total byte number M2 in the running data is cached in L2/L3, the byte number of the first row included in each row in L2/L3 is N2.
S205, calculating a second quotient value of the second total byte number and the second row byte number, and determining a product of the second quotient value and a second preset proportion as a second residual threshold value of the operating data in the shared cache;
the second predetermined ratio is X2 and may be predetermined, such as 50%.
The second residual threshold C2 = (M2/N2) × X2.
S206, determining the sum of the product of the first residual threshold and the first weight and the product of the second residual threshold and the second weight as the residual threshold of the running data in the cache;
if the first weight is a1 and the second weight is a2, which may be preset, the residual threshold C is a1 × C1+ a2 × C2.
S207, if the lost times are smaller than the residual threshold, determining that the task is a hot cache on a processor, and scheduling the task to the processor for running, wherein the processor is the processor on which the task runs before sleeping;
if the cache loss times are lower than the residual threshold value C, it indicates that the running data does not need to be read from the memory into L1 and L2/L3 and can be directly run based on the cached data, and the cache hot can be determined.
In addition, the embodiments of the present application apply to multi-CPU systems. Multi-CPU systems typically take four forms: multiprocessor systems, multicomputer systems, network systems, and distributed systems.
A multiprocessor system comprises two or more processors of similar capability that can exchange data with one another. All processors share the memory, I/O devices, controllers, and peripheral equipment; the whole hardware system is controlled by a unified operating system; and operations, tasks, and programs at all levels are executed in parallel across the processors.
For example, as shown in fig. 6, a schematic structural diagram of a multi-CPU system is shown, where the system includes multiple CPUs sharing a memory. The Multi-CPU system may be a symmetric Multi-Processing (SMP) system or a Heterogeneous Multi-Processing (HMP) system. The difference between SMP and HMP systems is that the multiple CPUs of the SMP system perform identically, while the multiple CPUs of the HMP system do not.
It should be noted that each CPU has different computing power, and even in the case where the computing power of each CPU is the same, since each CPU runs different tasks at different times, there is a difference in the remaining computing power. The residual computing power refers to the computing power of the residual CPU resources except the CPU resources occupied by the tasks running on the CPU at a certain moment.
After the cache is determined to be hot, the task is placed directly on the target CPU on which it ran before sleeping. Compared with dispatching the task to another CPU, which would require reloading the task's data and information from memory (such as dynamic memory DDR or virtual memory) into the L1/L2 caches, this yields better actual execution time and power consumption.
And S208, if the loss times are greater than or equal to the residual threshold, loading the running data from the memory or the external memory and adding the running data into the cache, and scheduling the task to the processor for running.
If the number of cache losses is greater than or equal to the residual threshold C, the running data has been lost and the task cannot run directly; it can be determined that the cache is not a hot cache, and the running data must be read from memory into L1 and L2/L3 before the task runs.
As shown in fig. 7, which is a schematic diagram of the relationship between different levels of memories in a computing system, the access speeds of CPUs for the memories in different levels are greatly different, and the access speeds are gradually reduced from top to bottom, but the storage spaces are gradually increased.
When the task is awakened again, if the data in the cache has been lost, it is read from physical memory or virtual memory back into the cache; when the data is not present in physical or virtual memory either, it is read, at a slower speed, from permanent storage.
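The fallback order described here can be sketched as follows (the level names and the promotion-into-cache step are illustrative, not claimed by the patent):

```python
def load(tag, cache, physical_mem, virtual_mem, permanent):
    """Look up `tag` level by level, fastest first; on a hit in a lower
    level, promote the data into the cache and report where it was found."""
    for level_name, level in (("cache", cache),
                              ("physical", physical_mem),
                              ("virtual", virtual_mem),
                              ("permanent", permanent)):
        if tag in level:
            cache[tag] = level[tag]   # promote into the cache on the way back
            return level_name, level[tag]
    raise KeyError(tag)
```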
After the data is loaded into the cache, the task can be continuously put on a target CPU which is operated before the task is in sleep to operate, and can also be put on any other CPU to operate.
In the embodiments of the present application, whether the processor on which a task ran before sleeping holds a hot cache is judged by using the processor's own PMU module to count the losses of the task's running data from the cache during the task's sleep; the loss count is objective and accurate, improving the accuracy of hot cache identification. Moreover, when the task is determined to be cache-hot on its pre-sleep processor, the scheduler keeps the task on that processor, saving task run time and processor power consumption.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 8, a schematic structural diagram of a hot cache identification apparatus according to an exemplary embodiment of the present application is shown. The hot cache identification apparatus may be implemented as all or part of a user terminal by software, hardware, or a combination of both. The apparatus 1 comprises a loss count obtaining module 10, a residual threshold obtaining module 20, and a hot cache determining module 30.
A missing number obtaining module 10, configured to obtain, when a task is awakened, a number of times that running data corresponding to the task is lost in a cache during sleep of the task;
a residual threshold obtaining module 20, configured to obtain a residual threshold of the operating data in the cache;
a hot cache determining module 30, configured to determine that the task is a hot cache on a processor if the number of times of loss is smaller than the residual threshold, where the processor is a processor on which the task runs before sleeping.
Optionally, the cache includes an exclusive cache and/or a shared cache, and the loss number obtaining module 10 is specifically configured to:
acquiring the number of times that the running data corresponding to the task is lost in the exclusive cache during the sleep period of the task; or, alternatively,
acquiring the number of times that the running data corresponding to the task is lost in the shared cache during the sleep period of the task; or, alternatively,
acquiring a first loss count of the running data corresponding to the task in the exclusive cache and a second loss count of the running data corresponding to the task in the shared cache during the sleep period of the task, and determining the sum of the two as the number of times of loss.
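The three alternatives above differ only in which counters feed the loss count. A minimal sketch, assuming hypothetical per-level miss counters (these are illustrative names, not real PMU event identifiers):

```c
/* Illustrative selection of the loss count from per-level miss counters.
 * exclusive_misses and shared_misses stand in for PMU miss counts of a
 * private (e.g. per-core) cache and a shared (e.g. last-level) cache. */
enum miss_source {
    EXCLUSIVE_ONLY,        /* first alternative  */
    SHARED_ONLY,           /* second alternative */
    EXCLUSIVE_PLUS_SHARED  /* third alternative: sum of both */
};

static unsigned long loss_count(enum miss_source src,
                                unsigned long exclusive_misses,
                                unsigned long shared_misses)
{
    switch (src) {
    case EXCLUSIVE_ONLY:
        return exclusive_misses;
    case SHARED_ONLY:
        return shared_misses;
    default:
        return exclusive_misses + shared_misses;
    }
}
```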
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquiring the total byte number of the running data in the exclusive cache and the line byte number of each cache line in the exclusive cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquiring the total byte number of the running data in the shared cache and the line byte number of each cache line in the shared cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquiring a first total byte number of the running data in the exclusive cache and a first line byte number of each cache line in the exclusive cache;
calculating a first quotient of the first total byte number divided by the first line byte number, and determining the product of the first quotient and a first preset proportion as a first residual threshold of the running data in the exclusive cache;
acquiring a second total byte number of the running data in the shared cache and a second line byte number of each cache line in the shared cache;
calculating a second quotient of the second total byte number divided by the second line byte number, and determining the product of the second quotient and a second preset proportion as a second residual threshold of the running data in the shared cache;
and determining the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache.
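Under the formulas above, each per-cache threshold is the number of cache lines occupied by the task's running data (total bytes divided by line bytes) scaled by a preset proportion, and the combined threshold blends the two with weights. A small sketch, with ratios and weights chosen purely for illustration:

```c
/* Per-cache residual threshold: the number of cache lines the task's
 * running data occupies, scaled by a preset proportion (e.g. 0.5). */
static double residual_threshold(unsigned long total_bytes,
                                 unsigned long line_bytes,
                                 double preset_ratio)
{
    /* Integer division mirrors "quotient of total bytes and line bytes". */
    return (double)(total_bytes / line_bytes) * preset_ratio;
}

/* Combined threshold over an exclusive and a shared cache, blended with
 * weights w1 and w2 as in the third variant described above. */
static double combined_threshold(unsigned long excl_bytes,
                                 unsigned long excl_line_bytes,
                                 double ratio1, double w1,
                                 unsigned long shared_bytes,
                                 unsigned long shared_line_bytes,
                                 double ratio2, double w2)
{
    double t1 = residual_threshold(excl_bytes, excl_line_bytes, ratio1);
    double t2 = residual_threshold(shared_bytes, shared_line_bytes, ratio2);
    return w1 * t1 + w2 * t2;
}
```

For example, 4096 bytes of running data in a cache with 64-byte lines occupy 64 lines; with a preset proportion of 0.5 the per-cache threshold is 32.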
Optionally, as shown in fig. 9, the apparatus further includes a data reading module 40, configured to:
and if the loss times are greater than or equal to the residual threshold, loading the operating data from a memory or an external memory and adding the operating data to the cache.
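As a toy illustration of this fallback, byte buffers stand in for the cache and the memory below; this is a sketch of the idea, not the application's actual implementation:

```c
#include <string.h>
#include <stddef.h>

/* When the loss count reaches the residual threshold, the cached copy
 * is presumed evicted, so the running data is reloaded from main memory
 * (or external storage) into the cache. Returns 1 if a reload happened. */
static int reload_if_cold(unsigned long miss_count,
                          unsigned long residual_threshold,
                          unsigned char *cache,
                          const unsigned char *memory,
                          size_t len)
{
    if (miss_count >= residual_threshold) {
        memcpy(cache, memory, len);  /* repopulate the cache */
        return 1;
    }
    return 0;  /* cache presumed hot; nothing to reload */
}
```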
Optionally, as shown in fig. 9, the apparatus further includes a task execution module 50, configured to:
and scheduling the task to be executed on the processor.
It should be noted that when the hot cache identification apparatus provided in the foregoing embodiment executes the hot cache identification method, the division into the above functional modules is merely an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiments provided above belong to the same concept; details of the implementation process can be found in the method embodiments and are not repeated here.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the embodiment of the application, based on the PMU module carried by the processor, the number of times the running data of a task is lost (missed) in the cache during the task's sleep period is counted, and this count is used to judge whether the cache on the processor where the task ran before sleeping is still hot. Because the miss count is objective and accurate, the accuracy of hot cache identification is improved. Meanwhile, when the cache on the processor where the task ran before sleeping is determined to be hot, the scheduler keeps the task running on that processor, which saves both the running time of the task and the power consumption of the processor.
An embodiment of the present application further provides a computer storage medium. The computer storage medium may store a plurality of instructions suitable for being loaded by a processor to execute the method steps in the embodiments shown in fig. 1 to 7; for the specific execution process, reference may be made to the descriptions of those embodiments, which are not repeated here.
Please refer to fig. 10, which provides a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 1000 may include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
Wherein a communication bus 1002 is used to enable connective communication between these components.
The user interface 1003 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store instructions, a program, code, a set of code, or a set of instructions. The memory 1005 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the various method embodiments described above, and the like; the data storage area may store the data involved in the above method embodiments. The memory 1005 may optionally be at least one storage device located remotely from the processor 1001. As shown in fig. 10, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a hot cache identification application program.
In the electronic device 1000 shown in fig. 10, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to call the hot cache identification application stored in the memory 1005, and specifically perform the following operations:
when a task is awakened, acquiring the loss times of running data corresponding to the task in a cache during the sleep period of the task;
obtaining a residual threshold value of the running data in the cache;
if the number of times of loss is less than the residual threshold, determining that the task is a hot cache on a processor, wherein the processor is the processor on which the task ran before sleeping.
In an embodiment, the cache includes an exclusive cache and/or a shared cache, and when the processor 1001 acquires the number of times of loss of the running data corresponding to the task in the cache during the task sleep period, the following operation is specifically performed:
acquiring the number of times that the running data corresponding to the task is lost in the exclusive cache during the sleep period of the task; or, alternatively,
acquiring the number of times that the running data corresponding to the task is lost in the shared cache during the sleep period of the task; or, alternatively,
acquiring a first loss count of the running data corresponding to the task in the exclusive cache and a second loss count of the running data corresponding to the task in the shared cache during the sleep period of the task, and determining the sum of the two as the number of times of loss.
In an embodiment, when the processor 1001 acquires the residual threshold of the running data in the cache, the following operations are specifically performed:
acquiring the total byte number of the running data in the exclusive cache and the line byte number of each cache line in the exclusive cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
In an embodiment, when the processor 1001 acquires the residual threshold of the running data in the cache, the following operations are specifically performed:
acquiring the total byte number of the running data in the shared cache and the line byte number of each cache line in the shared cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
In an embodiment, when the processor 1001 acquires the residual threshold of the running data in the cache, the following operations are specifically performed:
acquiring a first total byte number of the running data in the exclusive cache and a first line byte number of each cache line in the exclusive cache;
calculating a first quotient of the first total byte number divided by the first line byte number, and determining the product of the first quotient and a first preset proportion as a first residual threshold of the running data in the exclusive cache;
acquiring a second total byte number of the running data in the shared cache and a second line byte number of each cache line in the shared cache;
calculating a second quotient of the second total byte number divided by the second line byte number, and determining the product of the second quotient and a second preset proportion as a second residual threshold of the running data in the shared cache;
and determining the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache.
In one embodiment, the processor 1001 further performs the following operations:
and if the loss times are greater than or equal to the residual threshold, loading the operating data from a memory or an external memory and adding the operating data to the cache.
In one embodiment, the processor 1001 further performs the following operations:
and scheduling the task to be executed on the processor.
In the embodiment of the application, based on the PMU module carried by the processor, the number of times the running data of a task is lost (missed) in the cache during the task's sleep period is counted, and this count is used to judge whether the cache on the processor where the task ran before sleeping is still hot. Because the miss count is objective and accurate, the accuracy of hot cache identification is improved. Meanwhile, when the cache on the processor where the task ran before sleeping is determined to be hot, the scheduler keeps the task running on that processor, which saves both the running time of the task and the power consumption of the processor.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only a preferred embodiment of the present application, which certainly cannot be taken to limit the scope of the present application; equivalent variations and modifications made in accordance with the claims of the present application still fall within the scope of the present application.
Claims (10)
1. A method for identifying a hot cache, the method comprising:
when a task is awakened, acquiring the loss times of running data corresponding to the task in a cache during the sleep period of the task;
obtaining a residual threshold value of the running data in the cache;
if the number of times of loss is less than the residual threshold, determining that the task is a hot cache on a processor, wherein the processor is the processor on which the task ran before sleeping.
2. The method according to claim 1, wherein the cache includes an exclusive cache and/or a shared cache, and the obtaining a number of times that running data corresponding to the task is lost in the cache during the task sleep period includes:
acquiring the number of times that the running data corresponding to the task is lost in the exclusive cache during the sleep period of the task; or, alternatively,
acquiring the number of times that the running data corresponding to the task is lost in the shared cache during the sleep period of the task; or, alternatively,
acquiring a first loss count of the running data corresponding to the task in the exclusive cache and a second loss count of the running data corresponding to the task in the shared cache during the sleep period of the task, and determining the sum of the two as the number of times of loss.
3. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring the total byte number of the running data in the exclusive cache and the line byte number of each cache line in the exclusive cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
4. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring the total byte number of the running data in the shared cache and the line byte number of each cache line in the shared cache;
and calculating the quotient of the total byte number divided by the line byte number, and determining the product of the quotient and a preset proportion as the residual threshold of the running data in the cache.
5. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring a first total byte number of the running data in the exclusive cache and a first line byte number of each cache line in the exclusive cache;
calculating a first quotient of the first total byte number divided by the first line byte number, and determining the product of the first quotient and a first preset proportion as a first residual threshold of the running data in the exclusive cache;
acquiring a second total byte number of the running data in the shared cache and a second line byte number of each cache line in the shared cache;
calculating a second quotient of the second total byte number divided by the second line byte number, and determining the product of the second quotient and a second preset proportion as a second residual threshold of the running data in the shared cache;
and determining the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache.
6. The method of claim 1, further comprising:
and if the loss times are greater than or equal to the residual threshold, loading the operating data from a memory or an external memory and adding the operating data to the cache.
7. The method of claim 1 or 6, further comprising:
and scheduling the task to be executed on the processor.
8. An apparatus for identifying a hot cache, the apparatus comprising:
the loss times acquisition module is used for acquiring the loss times of the running data corresponding to the task in the cache during the sleep period of the task when the task is awakened;
a residual threshold obtaining module, configured to obtain a residual threshold of the operating data in the cache;
a hot cache determination module, configured to determine that the task is a hot cache on a processor if the number of times of loss is smaller than the residual threshold, where the processor is the processor on which the task ran before sleeping.
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to carry out the method steps according to any one of claims 1 to 7.
10. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010149714.4A CN113360192A (en) | 2020-03-06 | 2020-03-06 | Thermal cache identification method and device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113360192A true CN113360192A (en) | 2021-09-07 |
Family
ID=77523887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010149714.4A Pending CN113360192A (en) | 2020-03-06 | 2020-03-06 | Thermal cache identification method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113360192A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102081551A (en) * | 2011-01-28 | 2011-06-01 | 中国人民解放军国防科学技术大学 | Micro-architecture sensitive thread scheduling (MSTS) method |
CN106681830A (en) * | 2016-12-21 | 2017-05-17 | 深圳先进技术研究院 | Task cache space monitoring method and device |
CN109815425A (en) * | 2018-12-14 | 2019-05-28 | 平安科技(深圳)有限公司 | Caching data processing method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11550627B2 (en) | Hardware accelerated dynamic work creation on a graphics processing unit | |
CN110308982B (en) | Shared memory multiplexing method and device | |
CN108549574B (en) | Thread scheduling management method and device, computer equipment and storage medium | |
US10346212B2 (en) | Approach for a configurable phase-based priority scheduler | |
US9069609B2 (en) | Scheduling and execution of compute tasks | |
JP2018534675A (en) | Task subgraph acceleration by remapping synchronization | |
CN103197916A (en) | Methods and apparatus for source operand collector caching | |
CN104050033A (en) | System and method for hardware scheduling of indexed barriers | |
CN104050032A (en) | System and method for hardware scheduling of conditional barriers and impatient barriers | |
US20200293866A1 (en) | Methods for improving ai engine mac utilization | |
US20120297216A1 (en) | Dynamically selecting active polling or timed waits | |
CN110032450B (en) | Large-scale deep learning method and system based on solid-state disk extended memory | |
JP2018528515A (en) | A method for a simplified task-based runtime for efficient parallel computing | |
US12020065B2 (en) | Hierarchical processor selection | |
US9715413B2 (en) | Execution state analysis for assigning tasks to streaming multiprocessors | |
CN103294449B (en) | The pre-scheduling dissipating operation is recurred | |
CN114153500A (en) | Instruction scheduling method, instruction scheduling device, processor and storage medium | |
CN115981833A (en) | Task processing method and device | |
TW202109286A (en) | System and architecture of pure functional neural network accelerator | |
KR20150101870A (en) | Method and apparatus for avoiding bank conflict in memory | |
CN103218259A (en) | Computer-implemented method for selection of a processor, which is incorporated in multiple processors to receive work, which relates to an arithmetic problem | |
CN101847128A (en) | TLB management method and device | |
CN113032154B (en) | Scheduling method and device for virtual CPU, electronic equipment and storage medium | |
CN113360192A (en) | Thermal cache identification method and device, storage medium and electronic equipment | |
CN112114967B (en) | GPU resource reservation method based on service priority |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned |
Effective date of abandoning: 20240712 |