CN113360192A - Thermal cache identification method and device, storage medium and electronic equipment - Google Patents

Thermal cache identification method and device, storage medium and electronic equipment

Info

Publication number
CN113360192A
CN113360192A (application CN202010149714.4A)
Authority
CN
China
Prior art keywords
cache
task
byte number
residual threshold
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010149714.4A
Other languages
Chinese (zh)
Inventor
崔晓刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010149714.4A priority Critical patent/CN113360192A/en
Publication of CN113360192A publication Critical patent/CN113360192A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4418 Suspend and resume; Hibernate and awake
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44568 Immediately runnable code
    • G06F9/44578 Preparing or optimising for loading

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The embodiments of the application disclose a hot cache identification method and apparatus, a storage medium, and an electronic device. The method includes: when a task is awakened, acquiring the number of cache misses of the running data corresponding to the task during the task's sleep period; acquiring a residual threshold of the running data in the cache; and if the miss count is less than the residual threshold, determining that the cache is a hot cache. By counting the misses of the running data in the cache during the task's sleep period, a measure that is objective and accurate, the embodiments improve the accuracy of hot cache identification.

Description

Thermal cache identification method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a hot cache identification method and apparatus, a storage medium, and an electronic device.
Background
When a task is awakened after sleeping, existing schedulers assume that, if the sleep time is below a certain time threshold, the cache on the processor the task ran on before sleeping is still hot, and they preferentially schedule the task back onto that processor to continue running. If the task were instead scheduled onto another processor in the system, its running data would have to be reloaded into that processor's cache from memory (such as dynamic memory or virtual memory), so execution time and power consumption are better on the original processor.
This optimization rests on the assumption that, when the sleep time is below the time threshold, the task's running data most likely still remains in the cache rather than having been evicted. The time threshold, however, is difficult to set to a suitable value, so identifying whether the original processor still holds a hot cache for the task is insufficiently accurate.
Disclosure of Invention
The embodiments of the application provide a hot cache identification method and apparatus, a storage medium, and an electronic device. The hot cache is identified by counting the misses of the task's running data in the cache during the task's sleep period; because the miss count is objective and accurate, the accuracy of identifying a hot cache on the processor the task ran on before sleeping is improved.
The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a method for identifying a hot cache, where the method includes:
when a task is awakened, acquiring the number of cache misses of the running data corresponding to the task during the task's sleep period;
acquiring a residual threshold of the running data in the cache;
and if the miss count is less than the residual threshold, determining that the cache is a hot cache.
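The three claimed steps reduce to a single comparison. A minimal sketch in Python (the function and variable names are illustrative, not taken from the patent):

```python
def is_hot_cache(miss_count: int, residual_threshold: float) -> bool:
    """S101-S103 in one predicate: the cache is hot for the woken task
    iff the number of cache misses observed during its sleep period
    stayed below the residual threshold."""
    return miss_count < residual_threshold

# Example: 100 misses during sleep against a threshold of 256 lines.
assert is_hot_cache(100, 256)      # most running data still cached
assert not is_hot_cache(300, 256)  # cache content largely replaced
```

Note that the comparison is strictly less-than: a miss count equal to the threshold is treated as not hot.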
In a second aspect, an embodiment of the present application provides an apparatus for identifying a thermal cache, where the apparatus includes:
a miss count obtaining module, configured to obtain, when a task is awakened, the number of cache misses of the running data corresponding to the task during the task's sleep period;
a residual threshold obtaining module, configured to obtain a residual threshold of the running data in the cache;
and a hot cache determining module, configured to determine that the cache is a hot cache if the miss count is less than the residual threshold.
In a third aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
in the embodiments of the application, when a task is awakened, the number of cache misses of the task's running data during the task's sleep period is acquired, together with a residual threshold of the running data in the cache; if the miss count is less than the residual threshold, the cache is determined to be a hot cache. Judging the hot cache by counting the misses of the running data during the task's sleep period is objective and accurate, and improves the accuracy of identifying that the processor the task ran on before sleeping still holds the task's running data as a hot cache.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below cover only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for identifying a hot cache according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating an example of task sleep time provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram of a cache and a processor according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart of a method for identifying a hot cache according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a cache and a processor according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a multi-CPU system according to an embodiment of the present application;
FIG. 7 is a diagram illustrating a system memory according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a thermal cache identification apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a thermal cache identification apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
In the description of the present application, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The following describes in detail the hot cache identification method provided by the embodiments of the present application with reference to fig. 1 to 7. The method may be implemented by a computer program running on a Von Neumann-based hot cache identification device. The computer program may be integrated into an application or run as a separate tool-type application. The hot cache identification device in the embodiments of the present application may be a user terminal, including but not limited to: a smartphone, personal computer, tablet, handheld device, wearable device, computing device, or other processing device connected to a wireless modem.
Fig. 1 is a schematic flow chart of a method for identifying a hot cache according to an embodiment of the present disclosure. As shown in fig. 1, the method of the embodiment of the present application may include the steps of:
s101, when a task is awakened, obtaining the loss times of running data corresponding to the task in a cache during the sleep period of the task;
it is to be understood that a task may be viewed as a thread. At least one thread may be involved in a process in general. The thread can utilize the resources owned by the process, and in an operating system introducing the thread, the process is generally used as a basic unit for allocating the resources, and the thread is used as a basic unit for independently running and independently scheduling. Because the thread is smaller than the process and basically does not have system resources, the overhead for scheduling the thread is smaller, and the concurrent execution degree among a plurality of programs of the system can be more efficiently improved.
Each task is in one of 5 states of a sleep state, a ready state, a running state, a suspended state (waiting for some event to occur) and an interrupted state, and the running state of each task is the same as that of the process to which the task belongs. In this embodiment of the present application, the running state corresponding to the task is a sleep state and enters a ready state, that is, the task is a task to be awakened.
Before the task enters the sleep state, if the task is interrupted and preempted in the executing process, the task executes the sleep operation and enters the sleep state.
Specifically, when the task is awakened again, the number of misses or hits of the task's running data and information in the cache during the continuous sleep period before the wake-up can be counted by the Performance Monitor Unit (PMU) integrated inside the system's ARM chip. The PMU can count relevant data down to the most basic processor instruction level.
A task may sleep multiple times between its start and end. As shown in fig. 2, the task starts at T0 and runs until T1; T1 to T2 is the first sleep, and T3 to T4 is the second sleep. If the current wake-up time is T4, the counted cache misses are those that occurred between T3 and T4.
A miss means that a read instruction cannot find the content to be read in the cache, and servicing the miss requires discarding a line in the original cache. Each time a line is read and not found, the miss count is incremented by 1.
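The miss-and-replace behavior described above can be modeled with a toy fully associative LRU cache, where each lookup that fails evicts a line and bumps the miss count. This is a simulation sketch for intuition only, not the PMU counting mechanism itself:

```python
from collections import OrderedDict

class ToyCache:
    """Fully associative LRU cache holding `capacity` lines; counts misses."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()   # line address -> cached data
        self.misses = 0

    def read(self, line_addr):
        if line_addr in self.lines:
            self.lines.move_to_end(line_addr)   # hit: refresh LRU order
            return self.lines[line_addr]
        self.misses += 1                        # miss: count it...
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)      # ...and discard the LRU line
        self.lines[line_addr] = f"data@{line_addr}"
        return self.lines[line_addr]

cache = ToyCache(capacity=2)
cache.read(0); cache.read(1)   # two cold misses
cache.read(0)                  # hit
cache.read(2)                  # miss, evicts line 1
assert cache.misses == 3
```

Each increment of `misses` corresponds to one discarded line, which is exactly why the miss count below serves as a proxy for how much of the sleeping task's data has been replaced.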
Optionally, the cache includes an exclusive cache and/or a shared cache. An exclusive cache is private to each CPU in the system, while a shared cache is shared among the system's CPUs. When the running data is located only in an exclusive cache, the number of misses of the task's running data in the exclusive cache during the task's sleep period is acquired; when the running data is located only in a shared cache, the number of misses in the shared cache during the sleep period is acquired; and when the running data is located in both an exclusive cache and a shared cache, a first miss count in the exclusive cache and a second miss count in the shared cache during the sleep period are acquired, and the sum of the first and second miss counts is determined as the miss count.
It should be noted that the cache is a temporary data exchange area between the processor (CPU) and the memory, used to bridge the gap between the CPU's processing speed and the memory's read-write speed; the cache is faster than the memory. The cache is generally integrated directly into the CPU chip or placed on a separate chip interconnected with the motherboard bus; at this stage it is generally integrated directly on the CPU, as shown in fig. 3. The CPU usually needs to process the same data and execute the same instructions repeatedly; if they can be found in the cache, the CPU does not need to read them from memory or the hard disk, which reduces the response time of the whole machine.
The CPU is the operation and control core of a computer and the final execution unit for information processing and program running. It includes an arithmetic logic unit, a register unit, a control unit, and the like, and handles instruction processing, operation execution, timing control, and data processing. CPU performance is mainly reflected in the speed at which it runs programs; the indexes affecting that speed include the CPU's operating frequency, cache capacity, instruction set, and logic structure.
S102, obtaining a residual threshold value of the running data in the cache;
and the residual threshold is a critical value which is updated and replaced after the running data of the target task in the cache is lost.
Assuming that the size of the cache is M bytes, and the size of each line of the cache is L bytes, the cache comprises M/L, and after M/L losses, the cache content is considered to be completely updated. If the remaining 50% of the information content is used as the threshold point, the remaining threshold is 50% x M/L.
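Plugging in typical sizes makes the formula concrete. The 32 KiB cache and 64-byte line are assumed example values, not figures from the patent:

```python
M = 32 * 1024   # cache size in bytes (assumed example value)
L = 64          # bytes per cache line (assumed example value)

lines = M // L                      # number of lines the cache holds
residual_threshold = 0.5 * lines    # 50% remaining content as threshold point

assert lines == 512
assert residual_threshold == 256.0  # hot iff fewer than 256 misses while asleep
```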
Optionally, when the running data is located only in an exclusive cache, the total byte count of the running data in the exclusive cache and the line byte count of each line in the exclusive cache are acquired; the quotient of the total byte count and the line byte count is calculated, and the product of the quotient and a preset ratio is determined as the residual threshold of the running data in the cache. When the running data is located only in a shared cache, the same computation is performed over the shared cache: the total byte count of the running data in the shared cache and the line byte count of each line are acquired, their quotient is calculated, and the product of the quotient and a preset ratio is determined as the residual threshold. When the running data is located in both an exclusive cache and a shared cache, a first total byte count of the running data in the exclusive cache and a first line byte count of each line in the exclusive cache are acquired, a first quotient of the two is calculated, and the product of the first quotient and a first preset ratio is determined as a first residual threshold of the running data in the exclusive cache; a second total byte count in the shared cache and a second line byte count of each line in the shared cache are acquired, a second quotient is calculated, and the product of the second quotient and a second preset ratio is determined as a second residual threshold of the running data in the shared cache; the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight is then determined as the residual threshold of the running data in the cache.
S103, if the miss count is less than the residual threshold, determining that the task has a hot cache on a processor, where the processor is the one on which the task ran before sleeping.
If the cache miss count is below the residual threshold, most of the task's running data is still retained in the cache; the running data does not need to be read from memory into the cache, and the task can run directly on the cached data, so the cache of the processor the task ran on before sleeping can be determined to be hot.
In the embodiments of the application, when a task is awakened, the number of cache misses of the task's running data during the task's sleep period is acquired along with a residual threshold of the running data in the cache, and if the miss count is less than the residual threshold, the cache is determined to be hot. Judging the hot cache by counting the misses of the running data during the task's sleep period is objective and accurate, and improves the accuracy of identifying that the processor the task ran on before sleeping still holds the task's running data as a hot cache.
Fig. 4 is a schematic flowchart of a hot cache identification method according to an embodiment of the present disclosure. The embodiment of fig. 4 differs from that of fig. 1 in that it describes hot cache identification for the case where the cache corresponding to the task includes both an exclusive cache and a shared cache (that is, part of the task's running data is stored in the exclusive cache and part in the shared cache). When the cache is confirmed to be hot, the task is run on the CPU it ran on before sleeping; when the cache is confirmed not to be hot, the running data is first read from memory and the task is then run. The hot cache identification method may include the following steps:
S201, when a task is awakened, acquiring a first miss count of the task's running data in the exclusive cache and a second miss count in the shared cache during the task's sleep period, and determining the sum of the first and second miss counts as the miss count;
the CPU includes three levels of caches L1, L2, and L3, where L1 is a first level cache, L2 is a second level cache, and L3 is a third level cache, which are all caches integrated in the CPU, and they all function as a high-speed data buffer between the CPU and the main memory, as shown in fig. 5. L1 is closest to the CPU core; l2 second; l3 again. In the aspect of operation speed: l1 fastest, L2 times faster, L3 slowest; capacity size: l1 smallest, L2 larger, L3 largest.
The L1 is an exclusive cache, that is, unique to each CPU, if the system has multiple CPUs, then there are multiple L1, and L2 and L3 are shared caches, for example, for an 8-core CPU, if the system is divided into 2 groups of CPUs according to the performance optimization and the power consumption optimization, and each group has 4 CPUs, then the 2 groups of CPUs respectively correspond to one L2, one of which is shared by 4 CPUs with the power consumption optimization, and the other is shared by 4 CPUs with the performance optimization. L3 is shared by all CPUs in the system, and again taking an 8-core CPU as an example, these 8 CPUs share one L3.
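The sharing relationships just described (per-CPU L1, per-cluster L2, system-wide L3) can be captured in a small lookup structure. The 8-core, two-cluster split mirrors the example in the text; the cluster names and dictionary layout are illustrative assumptions:

```python
# 8 CPUs split into a power-optimized and a performance-optimized cluster.
CLUSTERS = {
    "power":       [0, 1, 2, 3],   # these four share one L2
    "performance": [4, 5, 6, 7],   # these four share another L2
}

def shared_l2_peers(cpu: int) -> list[int]:
    """CPUs whose shared L2 contents a task on `cpu` can benefit from."""
    for members in CLUSTERS.values():
        if cpu in members:
            return members
    raise ValueError(f"unknown CPU {cpu}")

def shared_l3_peers(cpu: int) -> list[int]:
    """All CPUs in the system share the single L3."""
    return [c for members in CLUSTERS.values() for c in members]

assert shared_l2_peers(5) == [4, 5, 6, 7]
assert len(shared_l3_peers(0)) == 8
```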
When part of the task's running data is cached in the exclusive cache L1 and another part in the shared cache L2, the CPU searches the fastest cache, L1, for the running data first, and then searches the next-fastest L2 or L3.
Specifically, when a task is woken up again, the PMU integrated inside the system's ARM chip can count a first miss count n1 of the task's running data and information in the exclusive cache L1 and a second miss count n2 in the shared cache L2 and/or L3 during the continuous sleep period before the wake-up, so that the miss count n = n1 + n2. A miss means that a read instruction fails to find the content to be read in the cache, and servicing it requires discarding an original line in the cache; each time a line is read and not found, the miss count is incremented by 1. The PMU can count relevant data down to the most basic processor instruction level.
S202, acquiring a first total byte count of the running data in the exclusive cache and a first line byte count of each line in the exclusive cache;
Suppose a first total of M1 bytes of the running data is cached in L1, and each line in L1 holds N1 bytes.
It should be noted that running data is read from the cache line by line.
S203, calculating a first quotient of the first total byte count and the first line byte count, and determining the product of the first quotient and a first preset ratio as a first residual threshold of the running data in the exclusive cache;
The first preset ratio X1 may be set in advance, for example to 50%.
The first residual threshold is C1 = (M1/N1) × X1.
S204, acquiring a second total byte count of the running data in the shared cache and a second line byte count of each line in the shared cache;
Suppose a second total of M2 bytes of the running data is cached in L2/L3, and each line in L2/L3 holds N2 bytes.
S205, calculating a second quotient of the second total byte count and the second line byte count, and determining the product of the second quotient and a second preset ratio as a second residual threshold of the running data in the shared cache;
The second preset ratio X2 may be set in advance, for example to 50%.
The second residual threshold is C2 = (M2/N2) × X2.
S206, determining the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache;
If the first weight is a1 and the second weight is a2, both of which may be preset, the residual threshold is C = a1 × C1 + a2 × C2.
S207, if the miss count is less than the residual threshold, determining that the task has a hot cache on a processor and scheduling the task onto that processor to run, where the processor is the one the task ran on before sleeping;
If the cache miss count is below the residual threshold C, the running data does not need to be read from memory into L1 and L2/L3; the task can run directly on the cached data, and the cache can be determined to be hot.
In addition, the embodiments of the present application apply to multi-CPU systems. Multi-CPU systems typically take four forms: multiprocessor systems, multicomputer systems, network systems, and distributed systems.
A multi-CPU system contains two or more processors of similar capability that can exchange data with each other. All processors share the memory, I/O devices, controllers, and peripheral devices; the entire hardware system is controlled by a unified operating system, and operations, tasks, programs, and array elements at all levels run in parallel across the processors and programs.
For example, fig. 6 shows a schematic structural diagram of a multi-CPU system in which multiple CPUs share a memory. The multi-CPU system may be a Symmetric Multi-Processing (SMP) system or a Heterogeneous Multi-Processing (HMP) system. The difference is that the CPUs of an SMP system have identical performance, while the CPUs of an HMP system do not.
It should be noted that CPUs may differ in computing power, and even when their computing power is the same, their residual computing power differs because each CPU runs different tasks at different times. The residual computing power of a CPU at a given moment is the computing power of the CPU resources left over after subtracting those occupied by the tasks running on it.
After the cache is determined to be hot, the task is placed directly on the target CPU it ran on before sleeping. Compared with dispatching the task to another CPU, which would require reloading the task's data and information from memory (such as dynamic memory DDR or virtual memory) into the L1/L2 caches, this gives better actual execution time and power consumption.
S208, if the miss count is greater than or equal to the residual threshold, loading the running data from the memory or external storage into the cache, and scheduling the task onto a processor to run.
If the cache miss count is greater than or equal to the residual threshold C, the running data has been lost and the task cannot run on it directly; the cache can be determined not to be hot, and the running data needs to be read from memory into L1 and L2/L3 before the task runs.
Fig. 7 is a schematic diagram of the relationship between the different levels of memory in a computing system. The CPU's access speed differs greatly across the levels: from top to bottom, access speed decreases while storage capacity increases.
When the task is awakened again, if its data has been lost from the cache, the data is read from physical memory or virtual memory into the cache; if the data is not present in physical or virtual memory either, it is read, more slowly, from permanent storage.
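The tiered fallback described above can be sketched as a chain of lookups, fastest tier first. The tier names and dict-based stand-ins are illustrative assumptions:

```python
def load_running_data(key, cache, physical_mem, virtual_mem, permanent_storage):
    """Fetch a task's running data from the fastest tier that still holds it,
    refilling the cache on the way back (the S208 / fig. 7 behavior)."""
    if key in cache:
        return cache[key], "cache"
    for tier_name, tier in (("physical memory", physical_mem),
                            ("virtual memory", virtual_mem),
                            ("permanent storage", permanent_storage)):
        if key in tier:
            cache[key] = tier[key]   # reload into the cache before running
            return tier[key], tier_name
    raise KeyError(key)

cache, phys, virt, disk = {}, {}, {}, {"task42": b"state"}
data, source = load_running_data("task42", cache, phys, virt, disk)
assert source == "permanent storage"   # slowest tier, first access
# The second access hits the refilled cache.
assert load_running_data("task42", cache, phys, virt, disk)[1] == "cache"
```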
After the data is loaded into the cache, the task may continue on the target CPU it ran on before sleeping, or be placed on any other CPU.
In the embodiments of the present application, whether the processor a task ran on before sleeping still holds a hot cache is judged by counting, with the processor's own PMU module, the misses of the task's running data in the cache during the task's sleep period; the miss count is objective and accurate, improving the accuracy of the hot cache judgment. Moreover, when the task is determined to have a hot cache on the processor it ran on before sleeping, the scheduler keeps the task on that processor, saving task running time and processor power consumption.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 8, a schematic structural diagram of a hot cache identification apparatus according to an exemplary embodiment of the present application is shown. The hot cache identification apparatus may be implemented as all or part of a user terminal by software, hardware, or a combination of both. The apparatus 1 comprises a miss count obtaining module 10, a residual threshold obtaining module 20, and a hot cache determining module 30.
A miss count obtaining module 10, configured to obtain, when a task is awakened, the number of cache misses of the running data corresponding to the task during the task's sleep period;
a residual threshold obtaining module 20, configured to obtain a residual threshold of the running data in the cache;
a hot cache determining module 30, configured to determine, if the miss count is less than the residual threshold, that the task has a hot cache on a processor, where the processor is the one the task ran on before sleeping.
Optionally, the cache includes an exclusive cache and/or a shared cache, and the miss count obtaining module 10 is specifically configured to:
acquire the number of times the running data corresponding to the task was lost in the exclusive cache during the task's sleep; or
acquire the number of times the running data corresponding to the task was lost in the shared cache during the task's sleep; or
acquire a first miss count of the running data corresponding to the task in the exclusive cache and a second miss count of the running data corresponding to the task in the shared cache during the task's sleep, and determine the sum of the first miss count and the second miss count as the miss count.
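As an illustration only, the three acquisition variants above can be sketched in Python as follows. The counter fields (`exclusive_misses_at_sleep` and so on) are hypothetical stand-ins for PMU cache-miss counter snapshots taken when the task goes to sleep and when it wakes; they are not a real API.

```python
def miss_count_during_sleep(task, use_exclusive=True, use_shared=True):
    """Number of cache misses of the task's running data recorded while
    the task slept, per the three variants described above."""
    total = 0
    if use_exclusive:
        # Variant 1: misses in the exclusive (per-core) cache only
        total += task["exclusive_misses_at_wakeup"] - task["exclusive_misses_at_sleep"]
    if use_shared:
        # Variant 2: misses in the shared cache only
        total += task["shared_misses_at_wakeup"] - task["shared_misses_at_sleep"]
    # Variant 3: both flags set, so the two counts are summed
    return total
```

In a real implementation these deltas would come from per-task PMU counters sampled at the sleep and wakeup scheduling points.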
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquire the total byte count of the running data in the exclusive cache and the per-line byte count of the exclusive cache;
calculate the quotient of the total byte count and the per-line byte count, and determine the product of the quotient and a preset ratio as the residual threshold of the running data in the cache.
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquire the total byte count of the running data in the shared cache and the per-line byte count of the shared cache;
calculate the quotient of the total byte count and the per-line byte count, and determine the product of the quotient and a preset ratio as the residual threshold of the running data in the cache.
Optionally, the residual threshold obtaining module 20 is specifically configured to:
acquire a first total byte count of the running data in the exclusive cache and the first per-line byte count of the exclusive cache;
calculate a first quotient of the first total byte count and the first per-line byte count, and determine the product of the first quotient and a first preset ratio as a first residual threshold of the running data in the exclusive cache;
acquire a second total byte count of the running data in the shared cache and the second per-line byte count of the shared cache;
calculate a second quotient of the second total byte count and the second per-line byte count, and determine the product of the second quotient and a second preset ratio as a second residual threshold of the running data in the shared cache;
determine the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache.
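The residual-threshold arithmetic described above — the quotient of the data's total byte count and the cache's per-line byte count, scaled by a preset ratio, optionally weighted across the two caches — can be sketched as follows. The concrete ratios and weights shown are illustrative placeholders; the application leaves their values to the implementer.

```python
def residual_threshold(total_bytes, line_bytes, ratio):
    """Threshold = (cache lines occupied by the running data) * preset ratio."""
    return (total_bytes / line_bytes) * ratio

def combined_residual_threshold(excl_bytes, excl_line_bytes,
                                shared_bytes, shared_line_bytes,
                                ratio1=0.5, ratio2=0.5, w1=0.5, w2=0.5):
    """Weighted sum of the exclusive-cache and shared-cache thresholds."""
    t1 = residual_threshold(excl_bytes, excl_line_bytes, ratio1)    # first residual threshold
    t2 = residual_threshold(shared_bytes, shared_line_bytes, ratio2)  # second residual threshold
    return t1 * w1 + t2 * w2
```

For example, 4096 bytes of running data in an exclusive cache with 64-byte lines occupies 64 lines; with a preset ratio of 0.5 the residual threshold is 32 lines.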
Optionally, as shown in fig. 9, the apparatus further includes a data reading module 40, configured to:
load the running data from memory or external storage and add it to the cache if the miss count is greater than or equal to the residual threshold.
Optionally, as shown in fig. 9, the apparatus further includes a task execution module 50, configured to:
schedule the task to be executed on the processor.
It should be noted that the division into functional modules in the above embodiment of the hot cache identification apparatus is only an example; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment provided above belong to the same concept; for details of the implementation process, refer to the method embodiment, which are not repeated here.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the embodiments of the present application, the PMU module built into the processor counts how many times the task's running data was lost (missed) in the cache while the task slept, and this miss count is used to decide whether the cache of the processor on which the task ran before sleeping is still hot. Because the count is objective and accurate, the accuracy of hot-cache identification is improved. Moreover, when the cache is determined to be hot, the scheduler keeps the task on that processor, saving task running time and processor power.
An embodiment of the present application further provides a computer storage medium that may store a plurality of instructions suitable for being loaded by a processor to execute the method steps of the embodiments shown in fig. 1 to 7; for the specific execution process, refer to the descriptions of those embodiments, which are not repeated here.
Please refer to fig. 10, which provides a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 1000 may include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
The communication bus 1002 is used to enable communication among these components.
The user interface 1003 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The processor 1001 may include one or more processing cores. Using various interfaces and lines, the processor 1001 connects the various components of the electronic device 1000, and performs the functions of the electronic device 1000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1005 and invoking data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 1001 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU renders and draws the content to be displayed on the display screen; the modem handles wireless communications. It is understood that the modem may also not be integrated into the processor 1001 and instead be implemented by a separate chip.
The memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1005 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above method embodiments, and the like; the data storage area may store the data referred to in the above method embodiments. Optionally, the memory 1005 may also be at least one storage device located remotely from the processor 1001. As shown in fig. 10, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a hot cache identification application program.
In the electronic device 1000 shown in fig. 10, the user interface 1003 is mainly used to provide an input interface for the user and acquire data input by the user, and the processor 1001 may be configured to call the hot cache identification application stored in the memory 1005 and specifically perform the following operations:
when a task is awakened, acquiring the number of times the running data corresponding to the task was lost in the cache during the task's sleep;
obtaining a residual threshold value of the running data in the cache;
if the miss count is less than the residual threshold, determining that the cache on a processor is hot for the task, where the processor is the one on which the task ran before sleeping.
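The core decision is a single comparison. A minimal sketch, assuming the miss count and the residual threshold have been obtained as described above:

```python
def is_hot_cache(miss_count, residual_threshold):
    # The task's working set is considered still resident ("hot") when
    # fewer lines were lost during the sleep than the residual threshold
    # allows; a count at or above the threshold means the cache is cold.
    return miss_count < residual_threshold
```

Note that the comparison is strict: a miss count equal to the threshold is treated as cold, which matches the "greater than or equal to" condition of the reload path below.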
In an embodiment, the cache includes an exclusive cache and/or a shared cache, and when acquiring the number of times the running data corresponding to the task was lost in the cache during the task's sleep, the processor 1001 specifically performs the following operations:
acquiring the number of times the running data corresponding to the task was lost in the exclusive cache during the task's sleep; or
acquiring the number of times the running data corresponding to the task was lost in the shared cache during the task's sleep; or
acquiring a first miss count of the running data corresponding to the task in the exclusive cache and a second miss count of the running data corresponding to the task in the shared cache during the task's sleep, and determining the sum of the first miss count and the second miss count as the miss count.
In an embodiment, when acquiring the residual threshold of the running data in the cache, the processor 1001 specifically performs the following operations:
acquiring the total byte count of the running data in the exclusive cache and the per-line byte count of the exclusive cache;
calculating the quotient of the total byte count and the per-line byte count, and determining the product of the quotient and a preset ratio as the residual threshold of the running data in the cache.
In an embodiment, when acquiring the residual threshold of the running data in the cache, the processor 1001 specifically performs the following operations:
acquiring the total byte count of the running data in the shared cache and the per-line byte count of the shared cache;
calculating the quotient of the total byte count and the per-line byte count, and determining the product of the quotient and a preset ratio as the residual threshold of the running data in the cache.
In an embodiment, when acquiring the residual threshold of the running data in the cache, the processor 1001 specifically performs the following operations:
acquiring a first total byte count of the running data in the exclusive cache and the first per-line byte count of the exclusive cache;
calculating a first quotient of the first total byte count and the first per-line byte count, and determining the product of the first quotient and a first preset ratio as a first residual threshold of the running data in the exclusive cache;
acquiring a second total byte count of the running data in the shared cache and the second per-line byte count of the shared cache;
calculating a second quotient of the second total byte count and the second per-line byte count, and determining the product of the second quotient and a second preset ratio as a second residual threshold of the running data in the shared cache;
determining the sum of the product of the first residual threshold and a first weight and the product of the second residual threshold and a second weight as the residual threshold of the running data in the cache.
In one embodiment, the processor 1001 further performs the following operations:
if the miss count is greater than or equal to the residual threshold, loading the running data from memory or external storage and adding it to the cache.
In one embodiment, the processor 1001 further performs the following operations:
scheduling the task to be executed on the processor.
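Putting the pieces together, the wakeup-time flow described above (compare the miss count against the threshold, reload from memory if the cache is cold, then schedule the task on its previous processor) might look like the following sketch. The action names are illustrative placeholders, not an actual scheduler API.

```python
def on_task_wakeup(miss_count, residual_threshold):
    """Return the ordered list of actions taken when a task wakes up."""
    actions = []
    if miss_count < residual_threshold:
        # Cache is hot: the working set largely survived the sleep, so
        # keep the task on the processor it ran on before sleeping.
        actions.append("keep_on_prev_cpu")
    else:
        # Cache is cold: repopulate it from memory or external storage,
        # then still schedule the task on that processor.
        actions.append("reload_from_memory")
        actions.append("keep_on_prev_cpu")
    return actions
```

A hot-cache wakeup thus skips the memory reload entirely, which is where the claimed savings in running time and power come from.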
In the embodiments of the present application, the PMU module built into the processor counts how many times the task's running data was lost (missed) in the cache while the task slept, and this miss count is used to decide whether the cache of the processor on which the task ran before sleeping is still hot. Because the count is objective and accurate, the accuracy of hot-cache identification is improved. Moreover, when the cache is determined to be hot, the scheduler keeps the task on that processor, saving task running time and processor power.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only a preferred embodiment of the present application and is not intended to limit the scope of the present application; all equivalent variations and modifications made according to the claims of the present application shall remain within the scope of the present application.

Claims (10)

1. A method for identifying a hot cache, the method comprising:
when a task is awakened, acquiring the loss times of running data corresponding to the task in a cache during the sleep period of the task;
obtaining a residual threshold value of the running data in the cache;
if the number of times of loss is less than the residual threshold, determining that the cache on a processor is hot for the task, wherein the processor is the processor on which the task ran before sleeping.
2. The method according to claim 1, wherein the cache includes an exclusive cache and/or a shared cache, and the obtaining a number of times that running data corresponding to the task is lost in the cache during the task sleep period includes:
acquiring the number of times the running data corresponding to the task was lost in the exclusive cache during the task's sleep; or
acquiring the number of times the running data corresponding to the task was lost in the shared cache during the task's sleep; or
acquiring a first loss frequency of the running data corresponding to the task in the exclusive cache and a second loss frequency of the running data corresponding to the task in the shared cache during the sleep period of the task, and determining the sum of the first loss frequency and the second loss frequency as the loss frequency.
3. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring the total byte number of the operating data in the exclusive cache and the row byte number of each row in the exclusive cache;
and calculating the quotient of the total byte number and the row byte number, and determining the product of the quotient and a preset proportion as a residual threshold value of the operation data in the cache.
4. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring the total byte number of the running data in the shared cache and the row byte number of each row in the shared cache;
and calculating the quotient of the total byte number and the row byte number, and determining the product of the quotient and a preset proportion as a residual threshold value of the operation data in the cache.
5. The method of claim 2, wherein obtaining the residual threshold of the operational data in the cache comprises:
acquiring a first total byte number of the operating data in the exclusive cache and a first row byte number of each row in the exclusive cache;
calculating a first quotient value of the first total byte number and the first row byte number, and determining a product of the first quotient value and a first preset proportion as a first residual threshold value of the operating data in the exclusive cache;
acquiring a second total byte number of the operating data in the shared cache and a second row byte number of each row in the shared cache;
calculating a second quotient value of the second total byte number and the second row byte number, and determining a product of the second quotient value and a second preset proportion as a second residual threshold value of the operating data in the shared cache;
and determining the sum of the product of the first residual threshold and the first weight and the product of the second residual threshold and the second weight as the residual threshold of the running data in the cache.
6. The method of claim 1, further comprising:
and if the loss times are greater than or equal to the residual threshold, loading the operating data from a memory or an external memory and adding the operating data to the cache.
7. The method of claim 1 or 6, further comprising:
and scheduling the task to be executed on the processor.
8. An apparatus for identifying a thermal cache, the apparatus comprising:
the loss times acquisition module is used for acquiring the loss times of the running data corresponding to the task in the cache during the sleep period of the task when the task is awakened;
a residual threshold obtaining module, configured to obtain a residual threshold of the operating data in the cache;
a hot cache determination module, configured to determine, if the number of times of loss is smaller than the residual threshold, that the cache on a processor is hot for the task, wherein the processor is the processor on which the task ran before sleeping.
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to carry out the method steps according to any one of claims 1 to 7.
10. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 7.
CN202010149714.4A 2020-03-06 2020-03-06 Thermal cache identification method and device, storage medium and electronic equipment Pending CN113360192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010149714.4A CN113360192A (en) 2020-03-06 2020-03-06 Thermal cache identification method and device, storage medium and electronic equipment


Publications (1)

Publication Number Publication Date
CN113360192A true CN113360192A (en) 2021-09-07

Family

ID=77523887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010149714.4A Pending CN113360192A (en) 2020-03-06 2020-03-06 Thermal cache identification method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113360192A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102081551A (en) * 2011-01-28 2011-06-01 中国人民解放军国防科学技术大学 Micro-architecture sensitive thread scheduling (MSTS) method
CN106681830A (en) * 2016-12-21 2017-05-17 深圳先进技术研究院 Task cache space monitoring method and device
CN109815425A (en) * 2018-12-14 2019-05-28 平安科技(深圳)有限公司 Caching data processing method, device, computer equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20240712