WO2018121242A1 - Multiple buffer-based data elimination method and device - Google Patents


Info

Publication number
WO2018121242A1
WO2018121242A1 · PCT/CN2017/115616
Authority
WO
WIPO (PCT)
Prior art keywords
cache
thread pool
level
data
thread
Prior art date
Application number
PCT/CN2017/115616
Other languages
French (fr)
Chinese (zh)
Inventor
王文铎
陈宗志
彭信东
王康
Original Assignee
北京奇虎科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京奇虎科技有限公司
Publication of WO2018121242A1 publication Critical patent/WO2018121242A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806: Multiuser, multiprocessor or multiprocessing cache systems
    • G06F 12/0811: Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies

Definitions

  • the present disclosure relates to the field of computer technologies, and in particular, to a data elimination method and apparatus based on multiple caches.
  • Cache is an important technique for bridging the speed mismatch between high- and low-speed devices. It is widely used in storage systems, databases, web servers, processors, file systems, disk systems, and other fields, where it reduces application response time and improves efficiency.
  • the storage media used by Cache technology, such as RAM and SSD, offer higher performance but are more expensive. For cost-effectiveness, Cache capacity is limited, so the Cache space must be managed effectively. A variety of Cache elimination algorithms have therefore emerged, such as the Least Recently Used (LRU) elimination algorithm, the Least Frequently Used (LFU) elimination algorithm, the Most Recently Used (MRU) elimination algorithm, and the Adaptive Replacement Cache (ARC) elimination algorithm.
  • LRU: Least Recently Used
  • LFU: Least Frequently Used
  • MRU: Most Recently Used
  • ARC: Adaptive Replacement Cache
  • the elimination algorithms in the prior art are generally single-threaded, so processing efficiency is low; this low efficiency sometimes means that once the limited cache space is used up it cannot be freed in time, and subsequent data cannot be stored in time.
  • the present disclosure has been made in order to provide a multi-cache based data elimination method and corresponding apparatus that overcomes the above problems or at least partially solves the above problems.
  • a data elimination method based on multiple caches, including:
  • dividing a plurality of cache levels according to a preset level division rule, and creating a matching thread pool for each cache level, wherein each thread pool includes multiple threads;
  • scanning each cache with the multiple threads in each thread pool, and determining the cache level of each cache according to the scan result and the level division rule;
  • eliminating, with the multiple threads in each thread pool, the data in the caches whose cache level matches that thread pool.
  • a data elimination device based on a plurality of caches, including:
  • a dividing module configured to divide a plurality of cache levels according to a preset level division rule, and to create a matching thread pool for each cache level, wherein each thread pool includes multiple threads;
  • a scanning module configured to scan each cache with the multiple threads in each thread pool, and to determine the cache level of each cache according to the scan result and the level division rule;
  • an eliminating module configured to use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
  • a computer program comprising computer-readable code which, when run on a computing device, causes the computing device to perform the multiple-cache-based data elimination method described above.
  • a computer-readable medium storing the computer program for performing the above multiple-cache-based data elimination method.
  • according to the multiple-cache-based data elimination method and apparatus, multiple cache levels may be divided according to a preset level division rule, and a matching thread pool may be created for each cache level; the multiple threads in each thread pool scan each cache, and the cache level of each cache is determined according to the scan result and the level division rule; the multiple threads in each thread pool then eliminate the data in the caches whose cache level matches that thread pool. By dividing caches into multiple levels and creating a corresponding thread pool for each level, the number of threads in each pool can be adjusted according to the cache level; moreover, parallel processing across multiple thread pools greatly improves the efficiency of data elimination.
  • FIG. 1 is a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 1 of the present disclosure;
  • FIG. 2 is a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 2 of the present disclosure;
  • FIG. 3 is a schematic structural diagram of a data elimination device based on multiple caches according to Embodiment 3 of the present disclosure;
  • FIG. 4 is a schematic structural diagram of a data elimination device based on multiple caches according to Embodiment 4 of the present disclosure;
  • FIG. 5 schematically illustrates a block diagram of a computing device for performing a multiple-cache-based data elimination method according to an embodiment of the present disclosure;
  • FIG. 6 schematically illustrates a storage unit for maintaining or carrying program code that implements a multiple-cache-based data elimination method according to an embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of a method for eliminating data based on multiple caches according to Embodiment 1 of the present disclosure. As shown in the figure, the method includes:
  • Step S110 Divide a plurality of cache levels according to a preset level division rule, and respectively create matching thread pools for each cache level.
  • the preset level division rule divides the caches into different levels according to their usage, with caches in the same level having similar usage. The levels themselves are defined by the technician.
  • the first embodiment of the present disclosure does not specifically limit this, and those skilled in the art can flexibly set according to actual conditions.
  • each thread pool contains multiple threads, which perform data elimination for caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the pools corresponding to different levels may also differ, so as to optimize resource allocation.
  • thread pool technology is adopted because creating a dedicated thread for every cache would consume substantial system resources and is impractical.
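As a minimal sketch of this design, one bounded pool can be created per level (Python's `ThreadPoolExecutor` is our choice here, and the pool sizes are purely illustrative); the total thread count then stays fixed no matter how many caches exist:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-level thread counts; a level needing more urgent
# elimination work would be given a larger pool.
POOL_SIZES = {"HIGH": 4, "LOW": 2, "IDLE": 1}

def create_pools(sizes=POOL_SIZES):
    """Create one thread pool per cache level instead of one thread per
    cache, so the total thread count is bounded by the configured sizes."""
    return {level: ThreadPoolExecutor(max_workers=n, thread_name_prefix=level)
            for level, n in sizes.items()}
```

Scan and elimination tasks for a given level would then be submitted to that level's pool rather than spawning fresh threads.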
  • Step S120 Scan each cache separately by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
  • by classifying all caches and specifying the processing relationship between each thread pool and caches of a given level, the embodiment of the present disclosure optimizes the workflow and effectively avoids the conflicts that would arise if multiple threads processed the same cache without coordination.
  • each cache is scanned by using multiple threads in each thread pool, and the cache level is determined for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
  • Step S130 Using multiple threads in each thread pool to eliminate data in the cache whose cache level matches the thread pool.
  • the data having the corresponding cache level is subjected to data elimination processing by using a plurality of threads in the thread pool matching the respective cache levels.
  • the first embodiment of the present disclosure does not specifically limit this, and those skilled in the art can flexibly set according to actual conditions.
  • the multiple-cache-based data elimination method divides multiple cache levels according to a preset level division rule and creates a matching thread pool for each cache level; the multiple threads in each thread pool scan each cache, and the cache level of each cache is determined from the scan result and the level division rule; the multiple threads in each thread pool then eliminate the data in the caches whose cache level matches that thread pool. By dividing caches into multiple levels and creating a corresponding thread pool for each level, the number of threads in each pool can be adjusted according to the cache level, and parallel processing across multiple thread pools greatly improves the efficiency of data elimination.
  • FIG. 2 is a schematic flowchart of a method for eliminating data based on multiple caches according to Embodiment 2 of the present disclosure. As shown in the figure, the method includes:
  • Step S210 Divide a plurality of cache levels according to a preset level division rule, and respectively create matching thread pools for each cache level.
  • the preset level division rule is used to divide each cache into different levels according to different usage conditions, and each cache in the same level has similar usage.
  • the level division rule includes dividing cache levels according to the ratio of a cache's remaining storage space to its total storage space: the larger this ratio, the higher the cache level; the smaller the ratio, the lower the cache level. For example, suppose the cache levels are divided into three: HIGH, LOW, and IDLE. A cache whose ratio of remaining to total storage space is above 60% is determined to be HIGH level; a cache whose ratio is between 30% and 60% is determined to be LOW level; and a cache whose ratio is below 30% is determined to be IDLE level.
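A sketch of this rule using the HIGH/LOW/IDLE thresholds from the example (the function name is hypothetical, and assigning the exact 30% and 60% boundaries to the lower level is our interpretation):

```python
def classify_cache_level(remaining_bytes: int, total_bytes: int) -> str:
    """Determine a cache's level from the ratio of remaining to total space:
    above 60% -> HIGH, 30%-60% -> LOW, below 30% -> IDLE."""
    ratio = remaining_bytes / total_bytes
    if ratio > 0.60:
        return "HIGH"
    elif ratio >= 0.30:
        return "LOW"
    else:
        return "IDLE"
```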
  • each thread pool contains multiple threads, which perform data elimination for caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the pools corresponding to different levels also differs, so as to optimize resource allocation.
  • Step S220: Set a corresponding weight value for each thread pool, and set the number of threads in each thread pool according to its weight value.
  • the weight value of a thread pool is set according to the cache level matched with that pool: the higher the matching cache level, the larger the pool's weight value; conversely, the lower the matching cache level, the smaller the weight value. The larger a pool's weight value, the more threads it contains; the smaller the weight value, the fewer threads. The number of threads in each thread pool therefore changes dynamically.
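One plausible way to turn weight values into thread counts is proportional allocation over a fixed thread budget; the helper below and its floor of one thread per pool are assumptions on our part, not the patent's prescribed formula:

```python
def threads_for_pools(weights: dict[str, int], total_threads: int) -> dict[str, int]:
    """Split a fixed thread budget across pools in proportion to their
    weight values: larger weight -> more threads, floored at 1 so every
    pool keeps at least one worker."""
    total_weight = sum(weights.values())
    return {name: max(1, round(total_threads * w / total_weight))
            for name, w in weights.items()}
```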
  • Step S230 Scan each cache separately by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
  • by classifying all caches and specifying the processing relationship between each thread pool and caches of a given level, the embodiment of the present disclosure optimizes the workflow and effectively avoids the conflicts that would arise if multiple threads processed the same cache without coordination.
  • each cache is scanned by using multiple threads in each thread pool, and the cache level is determined for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
  • the method of setting thread pool weight values may further include: periodically acquiring the scan results of each thread pool and determining the number of caches at each cache level from those results; then adjusting each pool's weight value according to the number of caches at its level, and adjusting the number of threads in each pool according to the adjusted weight value. The greater the number of caches at a cache level, the larger the weight value of the thread pool matching that level; the smaller the number of caches at a level, the smaller the matching pool's weight value.
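This periodic adjustment can be sketched as recomputing weights from the per-level cache counts observed in a scan; equating the weight with the raw count, floored at 1, is an illustrative choice rather than the patent's rule:

```python
from collections import Counter

def reweight_from_scan(scanned_levels: list[str],
                       all_levels=("HIGH", "LOW", "IDLE")) -> dict[str, int]:
    """Derive new pool weight values from a scan result: the more caches
    observed at a level, the larger that level's weight, floored at 1 so
    no pool is starved of threads entirely."""
    counts = Counter(scanned_levels)
    return {level: max(1, counts.get(level, 0)) for level in all_levels}
```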
  • the thread-pool weight-setting methods provided in step S220 and step S230 may also be combined to set more reasonable weight values.
  • the weight value of the thread pool can be further determined according to various factors such as the type and importance of the cache of the corresponding level.
  • Step S240 Using multiple threads in each thread pool to eliminate data in the cache whose cache level matches the thread pool.
  • data elimination is performed on the caches of each level by the multiple threads in the thread pool matching that level. Each thread pool may process caches of only one cache level.
  • for example, the caches are divided into the three levels HIGH, LOW, and IDLE, so only three thread pools are required:
  • the thread pool 1 corresponds to the HIGH level
  • the thread pool 2 corresponds to the LOW level
  • the thread pool 3 corresponds to the IDLE level.
  • alternatively, each thread pool can be used to handle caches of multiple cache levels. For example, if there are six cache levels, they can be processed by three thread pools, each handling two levels.
  • steps S230 and S240 may be performed repeatedly. For example, step S230 may be performed once every preset first time interval, and step S240 once every preset second time interval.
  • the first time interval and the second time interval may be equal or may not be equal.
  • the first time interval and the second time interval may be either fixed values or dynamically changing values.
  • the first time interval may be dynamically adjusted according to the scan result: when the number of HIGH-level caches in the scan result is large, the first time interval is reduced; when it is small, the first time interval is increased.
  • each thread pool may perform the elimination operation on caches of its level according to the same execution cycle, or according to different execution cycles. For example, a thread pool handling HIGH-level caches can run its data elimination with a shorter execution cycle to prevent the HIGH-level caches from running out of free space, while a thread pool handling IDLE-level caches can use a longer execution cycle to save system overhead.
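A per-pool execution cycle can be sketched as a stoppable periodic loop; a HIGH-level pool would be given a short interval and an IDLE-level pool a long one (the intervals and the stop-event mechanism are illustrative):

```python
import threading

def run_periodically(interval_s: float, task, stop: threading.Event):
    """Run `task` once every `interval_s` seconds until `stop` is set.
    Event.wait doubles as the sleep, so the loop exits promptly on stop."""
    while not stop.wait(interval_s):
        task()
```

Each thread pool would run such a loop with its own interval, e.g. a fraction of a second for HIGH and tens of seconds for IDLE.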
  • the number of executions and the execution timing of the above-mentioned step S230 and step S240 can be determined in a variety of manners by a person skilled in the art according to actual needs, which is not limited by the disclosure. It can be seen that the division of the cache level and the application of the thread pool technology provide more flexibility and controllability for the data elimination operation, and can meet the needs of various scenarios.
  • a specific method for performing data elimination may be flexibly set by a person skilled in the art, which is not limited in the disclosure. For example, it can be eliminated based on various factors such as data write time, number of data writes, data temperature attributes, and data types.
  • the data elimination method may be: calculating a temperature attribute value for each piece of data in the cache according to the total number of times that data has been written and a preset temperature attribute calculation rule, and determining the elimination order of the data in the cache according to the temperature attribute values.
  • the preset temperature attribute calculation rule is a rule for calculating the popularity degree of each cached data set by a person skilled in the art according to actual conditions.
  • the popularity of the cached data can be determined by factors such as the total number of times the cached data is written, and/or the storage period of the cached data.
  • the temperature attribute value of each cached data may be separately calculated according to the total number of writes of each cached data; the temperature attribute value of each cached data may be further calculated in combination with other factors.
  • the present disclosure does not limit the specific calculation rule of the temperature attribute value, as long as it can meet the actual needs of the user.
  • the cached data with the lowest temperature attribute value is eliminated first, proceeding from low to high temperature attribute values, so that data is eliminated according to its popularity and cache space is released in a timely and effective manner.
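Coldest-first elimination is then simply an ascending sort on the temperature attribute values (the helper below is a hypothetical sketch):

```python
def eviction_order(temperatures: dict[str, float]) -> list[str]:
    """Return cache keys ordered coldest-first: the item with the lowest
    temperature attribute value is the first candidate for elimination."""
    return sorted(temperatures, key=temperatures.get)
```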
  • the total number of writes may be divided in advance into a plurality of numerical intervals, with a corresponding interval score set for each interval, and the temperature attribute value determined from the interval score. For example, when the total number of writes falls in [0, 10), the interval score is 1; when it falls in [10, 50), the score is 5; when it falls in [50, 100], the score is 10.
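Under the example thresholds (reading the overlapping boundaries as half-open intervals, which is an interpretation on our part):

```python
def interval_score(total_writes: int) -> int:
    """Map a data item's total write count to its interval score,
    following the example: [0, 10) -> 1, [10, 50) -> 5, [50, 100] -> 10."""
    if total_writes < 10:
        return 1
    if total_writes < 50:
        return 5
    return 10  # counts above 100 are not specified in the example; capped by assumption
```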
  • the preset temperature attribute calculation rule may further include: dividing the cache duration into a plurality of cache periods, setting a corresponding period weight value for each cache period, and, for each piece of cached data, determining its temperature attribute value according to the period weight values of the cache periods in which it was written.
  • the cache duration may be the length of time between a first write time, corresponding to the oldest data in the cache, and a second write time, corresponding to the most recently written data.
  • the cache duration can also be a preset length of time. For example, suppose a cache is dedicated to storing data written within the last three hours, with data deleted automatically once it has been stored for more than three hours; the cache duration is then 3 hours. When dividing the cache duration into multiple cache periods, the duration may be divided into equal periods or into unequal periods. To facilitate calculating the temperature attribute of cached data by period, a period data table may optionally be set for each cache period after the division, each table recording the cached data written during its corresponding period.
  • the period weight values may be set equal, so that the temperature attribute value of cached data is calculated purely from the number of writes; alternatively, increasing (or decreasing) period weight values may be set in chronological order of the cache periods, so that the temperature attribute value combines the number of writes with the write times.
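A sketch of this period-weighted temperature: each write contributes the weight of the cache period in which it occurred, so equal weights reduce to a pure write count while increasing weights favour recently written data (the scheme shown is illustrative):

```python
def temperature_value(write_period_indices: list[int],
                      period_weights: list[float]) -> float:
    """Sum the weight of each cache period in which the data was written.
    `write_period_indices[i]` is the period index of the i-th write."""
    return sum(period_weights[p] for p in write_period_indices)
```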
  • the weight value setting of each time period is determined by a person skilled in the art according to actual conditions, and the disclosure does not limit this.
  • the setting of cache periods and period weight values enables the user to preferentially eliminate data from non-critical time periods according to actual needs, making the elimination scheme more flexible.
  • in addition to storage space, the foregoing preset level division rules may divide levels according to other attributes of the data stored in the cache; in short, the present disclosure does not limit the manner of dividing cache levels or of setting thread pool weights.
  • each of the above thread pools runs in parallel with the others, further improving data processing efficiency.
  • the multiple-cache-based data elimination method divides multiple cache levels according to a preset level division rule and creates a matching thread pool for each level; the multiple threads in each thread pool scan each cache, and the cache level of each cache is determined from the scan result and the level division rule; the multiple threads in each thread pool then eliminate the data in the caches whose level matches that pool. This solves the low efficiency of single-threaded processing in the prior art and realizes multi-threaded, parallel elimination over cache sets of different weights, ensuring the consistency of data elimination while greatly improving its efficiency. Through the thread pool improvements, the elimination priority of the cache sets is guaranteed in parallel.
  • FIG. 3 shows a data elimination device based on multiple caches provided by Embodiment 3 of the present disclosure.
  • the apparatus includes: a dividing module 310, a scanning module 320, and an eliminating module 330.
  • the dividing module 310 is configured to divide a plurality of cache levels according to a preset level dividing rule, and respectively create matching thread pools for each cache level.
  • the preset level division rule divides the caches into different levels according to their usage, with caches in the same level having similar usage. The levels themselves are defined by the technician.
  • the third embodiment of the present disclosure does not specifically limit this, and those skilled in the art can flexibly set according to actual conditions.
  • the dividing module 310 creates a matching thread pool for each cache level, and each thread pool contains multiple threads, which perform data elimination for caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the pools corresponding to different levels may also differ, so as to optimize resource allocation.
  • thread pool technology is adopted because creating a dedicated thread for every cache would consume substantial system resources and is impractical.
  • the scanning module 320 is configured to separately scan each cache by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
  • by classifying all caches and specifying the processing relationship between each thread pool and caches of a given level, the embodiment of the present disclosure optimizes the workflow and effectively avoids the conflicts that would arise if multiple threads processed the same cache without coordination.
  • the scanning module 320 scans each cache by using multiple threads in each thread pool, and determines a cache level for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
  • the eliminating module 330 is configured to use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that pool. Specifically, the eliminating module 330 performs data elimination on caches of each level using the multiple threads in the thread pool matching that level.
  • the third embodiment of the present disclosure does not specifically limit this. Those skilled in the art can flexibly set according to actual conditions.
  • the multiple-cache-based data elimination device can divide multiple cache levels according to a preset level division rule and create a matching thread pool for each level; the multiple threads in each pool scan each cache, and the cache level of each cache is determined from the scan result and the level division rule; the multiple threads in each pool then eliminate the data in the caches whose level matches that pool. By dividing caches into multiple levels and creating a corresponding thread pool for each level, the number of threads in each pool can be adjusted according to the cache level, and parallel processing across multiple thread pools greatly improves the efficiency of data elimination.
  • FIG. 4 is a schematic structural diagram of a data elimination device based on multiple caches according to Embodiment 4 of the present disclosure. As shown, the device includes: a dividing module 410, a weight module 420, a scanning module 430, and an eliminating module 440.
  • the dividing module 410 is configured to divide a plurality of cache levels according to a preset level dividing rule, and respectively create matching thread pools for each cache level.
  • the preset level division rule is used to divide each cache into different levels according to different usage conditions, and each cache in the same level has similar usage.
  • the level division rule includes dividing cache levels according to the ratio of a cache's remaining storage space to its total storage space: the larger this ratio, the higher the cache level; the smaller the ratio, the lower the cache level. For example, suppose the cache levels are divided into three: HIGH, LOW, and IDLE. A cache whose ratio of remaining to total storage space is above 60% is determined to be HIGH level; a cache whose ratio is between 30% and 60% is determined to be LOW level; and a cache whose ratio is below 30% is determined to be IDLE level.
  • the dividing module 410 creates a matching thread pool for each cache level, and each thread pool contains multiple threads, which perform data elimination for caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the pools corresponding to different levels also differs, so as to optimize resource allocation.
  • the weight module 420 is configured to separately set a corresponding weight value for each thread pool, and set the number of threads included in each thread pool according to the weight value of each thread pool.
  • the weight module 420 sets each thread pool's weight value according to the cache level matched with that pool: the higher the matching cache level, the larger the weight value; conversely, the lower the matching cache level, the smaller the weight value. The larger a pool's weight value, the more threads it contains; the smaller the weight value, the fewer threads. The number of threads in each thread pool therefore changes dynamically.
  • the scanning module 430 is configured to separately scan each cache by using multiple threads in each thread pool, and determine a cache level of each cache according to the scan result and the level division rule.
  • by classifying all caches and specifying the processing relationship between each thread pool and caches of a given level, the embodiment of the present disclosure optimizes the workflow and effectively avoids the conflicts that would arise if multiple threads processed the same cache without coordination.
  • the scanning module 430 scans each cache by using multiple threads in each thread pool, and determines a cache level for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
  • the method of setting thread pool weight values may further include: periodically acquiring the scan results of each thread pool and determining the number of caches at each cache level from those results; then adjusting each pool's weight value according to the number of caches at its level, and adjusting the number of threads in each pool according to the adjusted weight value. The greater the number of caches at a cache level, the larger the weight value of the thread pool matching that level; the smaller the number of caches at a level, the smaller the matching pool's weight value.
  • the thread-pool weight-setting methods provided by the weight module 420 and the scanning module 430 may also be combined to set more reasonable weight values.
  • the weight value of the thread pool can be further determined according to various factors such as the type and importance of the cache of the corresponding level.
  • the elimination module 440 utilizes multiple threads in each thread pool to eliminate data in the cache whose cache level matches the thread pool.
  • the elimination module 440 performs data elimination processing on the cache with the corresponding cache level by using multiple threads in the thread pool matching the respective cache levels.
	• Each thread pool may process caches of only one cache level.
  • the partition module 410 divides the cache into three levels of a HIGH level, a LOW level, and an IDLE level, so only three thread pools are required. Specifically, the thread pool 1 corresponds to the HIGH level, the thread pool 2 corresponds to the LOW level, and the thread pool 3 corresponds to the IDLE level.
  • each thread pool can also be used to handle multiple cache level caches.
	• when the cache levels include six levels, they can also be processed by three thread pools, each of which handles two levels of caches.
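A minimal sketch of serving six cache levels with three thread pools, assuming a simple pairing of adjacent levels; the level and pool names are hypothetical:

```python
LEVELS = ["L1", "L2", "L3", "L4", "L5", "L6"]  # six cache levels
POOLS = ["pool-1", "pool-2", "pool-3"]         # three thread pools

def pool_for_level(level):
    """Assign two consecutive cache levels to each thread pool."""
    idx = LEVELS.index(level) // 2
    return POOLS[idx]
```

Any fixed level-to-pool mapping works, as long as no two pools ever serve the same cache.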
  • the scanning module 430 and the eliminating module 440 can be repeatedly operated multiple times.
	• the scanning module 430 can be run once every preset first time interval, and the eliminating module 440 can be run once every preset second time interval.
  • the first time interval and the second time interval may be equal or may not be equal.
  • the first time interval and the second time interval may be either fixed values or dynamically changing values.
	• the first time interval may be dynamically adjusted according to the scan result: when the number of HIGH-level caches in the scan result is large, the first time interval is reduced; when the number of HIGH-level caches in the scan result is small, the first time interval is increased.
  • each thread pool can perform the elimination operation on the cache of the corresponding level according to the same execution cycle, or perform the elimination operation on the cache of the corresponding level according to different execution cycles.
	• a thread pool for handling HIGH-level caches can perform data elimination operations with a shorter execution cycle to prevent insufficient free space in the HIGH-level caches; a thread pool for handling IDLE-level caches can perform data elimination operations with a comparatively long execution cycle to save system overhead.
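One hedged way to realize per-level execution cycles is a timer loop per thread pool; the interval values and the `evict` callback below are illustrative assumptions, not specified by the disclosure:

```python
import threading

# Shorter cycle for HIGH-level caches, longer for IDLE-level ones.
CYCLE_SECONDS = {"HIGH": 1.0, "LOW": 5.0, "IDLE": 30.0}

def run_periodically(level, evict, stop_event):
    """Call evict(level) once per CYCLE_SECONDS[level] until stop_event is set."""
    while not stop_event.wait(CYCLE_SECONDS[level]):
        evict(level)
```

Each pool would run such a loop on one of its threads, so the elimination cadence of each cache level is independent of the others.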
  • the number of times of running and the running time of the scanning module 430 and the eliminating module 440 can be determined in a variety of manners by a person skilled in the art according to actual needs, which is not limited by the disclosure. It can be seen that the division of the cache level and the application of the thread pool technology provide more flexibility and controllability for the data elimination operation, and can meet the needs of various scenarios.
  • the specific method for the data elimination by the eliminating module 440 can be flexibly set by a person skilled in the art, which is not limited by the disclosure. For example, it can be eliminated based on various factors such as data write time, number of data writes, data temperature attributes, and data types.
	• the data elimination method may be: calculating a temperature attribute value for each piece of data in the cache according to the total number of writes of each piece of data and a preset temperature attribute calculation rule, and determining the elimination order of the data within the cache according to the temperature attribute values.
  • the preset temperature attribute calculation rule is a rule for calculating the popularity degree of each cached data set by a person skilled in the art according to actual conditions.
  • the popularity of the cached data can be determined by factors such as the total number of times the cached data is written, and/or the storage period of the cached data.
  • the temperature attribute value of each cached data may be separately calculated according to the total number of writes of each cached data; the temperature attribute value of each cached data may be further calculated in combination with other factors.
  • the present disclosure does not limit the specific calculation rule of the temperature attribute value, as long as it can meet the actual needs of the user.
	• the cached data is eliminated in order of the calculated temperature attribute values from low to high, starting with the cached data having the lowest temperature attribute value, thereby achieving the effect of eliminating data according to the popularity of the cached data and releasing cache space in a timely and effective manner.
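The low-to-high elimination order can be sketched as follows; the entry and size structures are illustrative assumptions, not the disclosure's data model:

```python
def eviction_order(entries):
    """entries: mapping of cache key -> temperature attribute value.
    Returns keys from the 'coldest' (lowest value) to the 'hottest'."""
    return sorted(entries, key=entries.get)

def evict_until(entries, sizes, bytes_needed):
    """Evict coldest entries first until at least bytes_needed is freed."""
    freed, evicted = 0, []
    for key in eviction_order(entries):
        if freed >= bytes_needed:
            break
        freed += sizes[key]
        evicted.append(key)
    return evicted
```

Sorting by the temperature attribute value is what ties the elimination order to data popularity rather than to insertion order alone.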
	• the total number of writes may be divided into a plurality of numerical intervals in advance, a corresponding interval score may be set for each numerical interval, and the temperature attribute value is then determined based on the interval score. For example, when the total number of writes falls in the value range [0, 10], the interval score is 1; when it falls in the value range [10, 50], the interval score is 5; when it falls in the value range [50, 100], the interval score is 10.
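A direct sketch of the interval-score rule using the example boundaries above; the open/closed choice at each boundary and the treatment of values beyond 100 are assumptions:

```python
def interval_score(total_writes):
    """Map a total write count to the interval score of its value range."""
    if total_writes <= 10:   # value range [0, 10]
        return 1
    if total_writes <= 50:   # value range (10, 50]
        return 5
    return 10                # value range (50, 100] and beyond (assumed)
```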
	• the foregoing preset temperature attribute calculation rule may further include: dividing the cache duration corresponding to the cache into a plurality of cache periods in advance, and setting a corresponding period weight value for each cache period; for each piece of cached data, determining the temperature attribute value of the cached data according to the period weight values of the cache periods in which that cached data was written.
	• the cache duration may be: the length of time bounded by a first data write time, corresponding to the data with the oldest write time in the cache, and a second data write time, corresponding to the data with the newest write time.
	• the cache duration can also be a preset length of time. For example, suppose a cache is dedicated to storing cached data from the last three hours, and cached data is automatically deleted once it has been stored for more than three hours; in that case, the cache duration is 3 hours.
	• when the cache duration is divided into multiple cache periods, the entire cache duration may be divided into multiple equal cache periods, or into multiple unequal cache periods.
	• a period data table may be set for each cache period, wherein each period data table is used for recording the cached data written during the corresponding cache period.
	• the weight values of the respective cache periods may be set to be equal, so that the temperature attribute value of the cached data is calculated purely from the number of writes of each piece of cached data; alternatively, incremented (or decremented) period weight values may be set according to the chronological order of the cache periods, so that the temperature attribute value of each piece of cached data is calculated by combining its number of writes with its write times.
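A hedged sketch of combining per-period write counts with period weight values into a temperature attribute value; the period layout and weight numbers are illustrative assumptions:

```python
# Period weights increase for more recent cache periods (assumed ordering).
PERIOD_WEIGHTS = [1, 2, 3]

def temperature_value(writes_per_period):
    """writes_per_period[i] = number of writes during cache period i.
    The temperature attribute value is the weighted sum of write counts."""
    return sum(w * n for w, n in zip(PERIOD_WEIGHTS, writes_per_period))
```

With equal weights this degenerates to a pure write-count measure; with increasing weights, recent writes raise the temperature more than old ones.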
  • the weight value setting of each time period is determined by a person skilled in the art according to actual conditions, and the disclosure does not limit this.
  • the setting of the buffer period and the time weight value enables the user to prioritize the elimination of data in non-critical time periods according to actual needs, so that the elimination scheme is more flexible.
	• in addition to storage space, the foregoing preset level division rules may also divide the cache levels according to other factors, such as the type of data stored in the cache; in summary, the present disclosure does not limit the manner of dividing cache levels or of setting thread pool weights.
  • each of the above thread pools runs in parallel with each other, thereby further improving data processing efficiency.
	• the data elimination device based on multiple caches may divide a plurality of cache levels according to a preset level division rule and create a matching thread pool for each cache level; scan each cache by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule; and use multiple threads in each thread pool to eliminate data in the caches whose cache level matches the thread pool. This solves the problem of the low efficiency of single-threaded processing in the prior art, and realizes multi-threaded parallel data elimination operations on cache sets of different weights, which ensures the consistency of data elimination and greatly improves the efficiency of data elimination processing. Through the improvement of the thread pool, the elimination priority of the cache sets can be guaranteed in parallel.
  • FIG. 5 schematically illustrates a block diagram of a computing device for performing a multiple cache based data retirement method in accordance with an embodiment of the present disclosure.
  • the computing device conventionally includes a processor 510 and a computer program product or computer readable medium in the form of a storage device 520.
  • Storage device 520 can be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • Storage device 520 has a storage space 530 that stores program code 531 for performing any of the method steps described above.
  • storage space 530 storing program code may include various program code 531 for implementing various steps in the above methods, respectively.
  • the program code can be read from or written to one or more computer program products.
  • These computer program products include program code carriers such as a hard disk, a compact disk (CD), a memory card, or a floppy disk.
	• a computer program product is typically a portable or fixed storage unit such as that shown in FIG. 6.
	• the storage unit may have storage segments, storage spaces, and the like arranged similarly to the storage device 520 in the computing device of FIG. 5.
  • the program code can be compressed, for example, in an appropriate form.
	• the storage unit includes computer readable code 531' for performing the steps of the method according to the present disclosure, i.e., code that can be read by a processor such as the processor 510, and which, when run by the computing device, causes the computing device to perform the various steps of the methods described above.
  • modules in the devices of the embodiments can be adaptively changed and placed in one or more devices different from the embodiment.
  • the modules or units or components of the embodiments may be combined into one module or unit or component, and further they may be divided into a plurality of sub-modules or sub-units or sub-components.
	• any combination of all of the features disclosed in this specification (including the accompanying claims, abstract, and drawings), and of all of the processes or units of any method or device so disclosed, may be employed.
  • Each feature disclosed in this specification (including the accompanying claims, the abstract and the drawings) may be replaced by alternative features that provide the same, equivalent or similar purpose.
  • Various component embodiments of the present disclosure may be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof.
	• in practice, a microprocessor or a digital signal processor may be used to implement some or all of the functions of some or all of the components of the multi-cache-based data elimination device according to embodiments of the present disclosure.
  • the present disclosure may also be implemented as a device or device program (eg, a computer program and a computer program product) for performing some or all of the methods described herein.
  • Such a program implementing the present disclosure may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

Abstract

A multiple-buffer-based data elimination method and device. The method comprises: classifying, according to a preconfigured level classification rule, buffers into a plurality of buffer levels and creating a thread pool matching each buffer level, wherein each thread pool contains a plurality of threads (S110); employing the plurality of threads in each thread pool to scan the respective buffers, and determining, according to the scan result and the classification rule, the buffer level of each buffer (S120); and employing the plurality of threads in each thread pool to eliminate data in the buffers whose buffer level matches that thread pool (S130).

Description

Data elimination method and device based on multiple caches
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201611246005.8, entitled "A Multi-Cache Based Data Elimination Method and Apparatus", filed with the Chinese Patent Office on December 29, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer technologies, and in particular, to a data elimination method and apparatus based on multiple caches.
Background
Cache is an important technology used to resolve the speed mismatch between high-speed and low-speed devices. It is widely used in various fields such as storage systems, databases, web servers, processors, file systems, and disk systems, and can reduce application response time and improve efficiency. However, the storage media used to implement cache technology, such as RAM and SSD, offer higher performance but are also more expensive. For cost-performance reasons, cache capacity is limited, so the cache space must be managed effectively. A variety of cache elimination algorithms have therefore emerged, for example: the Least Recently Used (LRU) elimination algorithm; the Least Frequently Used (LFU) elimination algorithm; the Most Recently Used (MRU) elimination algorithm; and the Adaptive Replacement Cache (ARC) elimination algorithm.
However, in the process of implementing the present disclosure, the inventors found that the prior art has at least the following problems: prior-art elimination algorithms are generally single-threaded, so their processing efficiency is low, and this low efficiency sometimes means that once the limited cache space is used up it cannot be vacated in time, so subsequent data cannot be stored in time.
Summary of the invention
In view of the above problems, the present disclosure is proposed in order to provide a multi-cache-based data elimination method and a corresponding apparatus that overcome the above problems or at least partially solve them.
According to one aspect of the present disclosure, a data elimination method based on multiple caches is provided, including:
dividing a plurality of cache levels according to a preset level division rule, and creating a matching thread pool for each cache level, wherein each thread pool contains multiple threads;
scanning each cache by using the multiple threads in each thread pool, and determining the cache level of each cache according to the scan result and the level division rule;
using the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
According to another aspect of the present disclosure, a data elimination device based on multiple caches is provided, including:
a dividing module, configured to divide a plurality of cache levels according to a preset level division rule and create a matching thread pool for each cache level, wherein each thread pool contains multiple threads;
a scanning module, configured to scan each cache by using the multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule;
an elimination module, configured to use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
According to a third aspect of the present disclosure, a computer program is provided, including:
computer readable code which, when run on a computing device, causes the computing device to perform the multi-cache-based data elimination method described above.
According to a fourth aspect of the present disclosure, a computer readable medium is provided, which:
stores the computer program for performing the multi-cache-based data elimination method described above.
The multi-cache-based data elimination method and device according to the present disclosure can divide a plurality of cache levels according to a preset level division rule and create a matching thread pool for each cache level; scan each cache by using the multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule; and use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool. It can thus be seen that by dividing the caches into multiple cache levels and creating a corresponding thread pool for each cache level, the number of threads in each thread pool can be better adjusted according to the cache level; moreover, parallel processing by multiple thread pools greatly improves the efficiency of data elimination processing.
The above description is only an overview of the technical solutions of the present disclosure. In order that the technical means of the present disclosure may be more clearly understood and implemented in accordance with the contents of the specification, and in order to make the above and other objects, features, and advantages of the present disclosure more apparent, specific embodiments of the present disclosure are set forth below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present disclosure. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
FIG. 1 is a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 1 of the present disclosure;
FIG. 2 is a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 2 of the present disclosure;
FIG. 3 is a schematic structural diagram of a data elimination device based on multiple caches according to Embodiment 3 of the present disclosure;
FIG. 4 is a schematic structural diagram of a data elimination device based on multiple caches according to Embodiment 4 of the present disclosure;
FIG. 5 schematically shows a block diagram of a computing device for performing the multi-cache-based data elimination method according to an embodiment of the present disclosure; and
FIG. 6 schematically shows a storage unit for maintaining or carrying program code implementing the multi-cache-based data elimination method according to an embodiment of the present disclosure.
Preferred embodiments of the invention
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be more thoroughly understood, and so that the scope of the present disclosure can be fully conveyed to those skilled in the art.
Embodiment 1
FIG. 1 shows a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 1 of the present disclosure. As shown in the figure, the method includes:
Step S110: Divide a plurality of cache levels according to a preset level division rule, and create a matching thread pool for each cache level.
The preset level division rule is used to divide the caches into different levels according to their usage, with caches in the same level having similar usage. The levels are defined by a technician. The specific content of the preset level division rule is not limited in Embodiment 1 of the present disclosure, and those skilled in the art can set it flexibly according to the actual situation.
In order to improve the processing efficiency of data elimination, a matching thread pool is created for each cache level, and each thread pool contains multiple threads. The multiple threads in each thread pool are all used for the data elimination processing of the caches of the corresponding level. Because caches of different levels have different usage, the number of threads in the thread pools corresponding to different levels may also differ, so as to optimize the resource allocation as much as possible.
Thread pool technology is adopted because setting up a dedicated thread for every cache would consume a large amount of system resources and is not practical.
Step S120: Scan each cache by using the multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
When multiple threads perform data elimination on all the caches, if the matching relationship between threads and caches is not constrained, two threads may process the same cache at the same time; in that case the two threads conflict, causing a series of problems. Therefore, the embodiment of the present disclosure classifies all the caches into levels and specifies the correspondence between each thread and the caches of a given level, thereby effectively avoiding such conflicts and optimizing the workflow.
Specifically, each cache is scanned by the multiple threads in each thread pool, and a cache level is determined for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
Step S130: Use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
Specifically, according to the cache levels determined in step S120, the caches of each cache level are subjected to data elimination processing by the multiple threads in the thread pool matching that level. The specific method of data elimination processing is not limited in Embodiment 1 of the present disclosure, and those skilled in the art can set it flexibly according to the actual situation.
It can thus be seen that the multi-cache-based data elimination method provided by the embodiment of the present disclosure can divide a plurality of cache levels according to a preset level division rule and create a matching thread pool for each cache level; scan each cache by using the multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule; and use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool. By dividing the caches into multiple cache levels and creating a corresponding thread pool for each cache level, the number of threads in each thread pool can be better adjusted according to the cache level; moreover, parallel processing by multiple thread pools greatly improves the efficiency of data elimination processing.
Embodiment 2
FIG. 2 shows a schematic flowchart of a data elimination method based on multiple caches according to Embodiment 2 of the present disclosure. As shown in the figure, the method includes:
Step S210: Divide a plurality of cache levels according to a preset level division rule, and create a matching thread pool for each cache level.
The preset level division rule is used to divide the caches into different levels according to their usage, with caches in the same level having similar usage. In the embodiment of the present disclosure, the level division rule includes: dividing the cache levels according to the ratio of a cache's remaining storage space to its total storage space, where the larger the ratio of remaining storage space to total storage space, the higher the cache level, and the smaller the ratio, the lower the cache level. For example, suppose the cache levels are divided into three levels: a high (HIGH) level, a low (LOW) level, and an idle (IDLE) level. A cache whose ratio of remaining storage space to total storage space is above 60% is determined to be HIGH level; a cache whose ratio is between 30% and 60% is determined to be LOW level; and a cache whose ratio is below 30% is determined to be IDLE level.
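A minimal sketch of this ratio-based level division rule, using the example thresholds above; the boundary handling at exactly 60% and 30% is an assumption:

```python
def cache_level(free_bytes, total_bytes):
    """Classify a cache by the ratio of remaining to total storage space."""
    ratio = free_bytes / total_bytes
    if ratio >= 0.6:   # above 60% free -> HIGH level
        return "HIGH"
    if ratio >= 0.3:   # between 30% and 60% free -> LOW level
        return "LOW"
    return "IDLE"      # below 30% free -> IDLE level
```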
In order to improve the processing efficiency of data elimination, a matching thread pool is created for each cache level, and each thread pool contains multiple threads. The multiple threads in each thread pool are all used for the data elimination processing of the caches of the corresponding level. Because caches of different levels have different usage, the number of threads in the thread pools corresponding to different levels also differs, so as to optimize the resource allocation as much as possible.
Step S220: Set a corresponding weight value for each thread pool, and set the number of threads contained in each thread pool according to its weight value.
As a specific weight-setting method, for each thread pool, the weight value of the thread pool may be set according to the cache level matching that thread pool: the higher the matching cache level, the larger the weight value of the thread pool; conversely, the lower the matching cache level, the smaller the weight value of the thread pool. The larger the weight value of a thread pool, the more threads it contains; the smaller the weight value, the fewer threads it contains. The number of threads contained in each thread pool therefore changes dynamically.
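An illustrative sketch of step S220; the weight numbers and the scaling factor are assumptions, not values specified by the disclosure:

```python
# Higher cache levels get larger weight values (assumed example weights).
LEVEL_WEIGHTS = {"HIGH": 3, "LOW": 2, "IDLE": 1}

def threads_for_pool(level, threads_per_weight=2):
    """Size a thread pool in proportion to its weight value."""
    return LEVEL_WEIGHTS[level] * threads_per_weight
```

Re-running this sizing whenever the weights change is what makes each pool's thread count dynamic.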
步骤S230:利用各个线程池中的多个线程分别扫描各个缓存,根据扫描结果以及等级划分规则确定各个缓存的缓存等级。Step S230: Scan each cache separately by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
因为当多个线程对所有的缓存进行数据淘汰处理时,如果不对线程与缓存的匹配关系进行限定,就会出现两个线程对同一个缓存进行处理的情况,此时,两个线程就会产生冲突,导致一系列问题。所以,本公开实施例通过对所有缓存进行等级划分,并规定各个线程与不同等级的缓存的对应处理关系,从而有效避免了上述冲突情况的发生,优化了工作流程。Because when multiple threads perform data elimination processing on all caches, if the matching relationship between the threads and the cache is not limited, two threads will process the same cache. At this time, two threads will be generated. Conflicts lead to a series of problems. Therefore, the embodiment of the present disclosure optimizes the workflow by classifying all caches and specifying the corresponding processing relationship between each thread and a different level of cache, thereby effectively avoiding the occurrence of the above conflict situation.
具体地,利用各个线程池中的多个线程,分别扫描各个缓存,根据扫描结果和等级划分规则为每一个被扫描过的缓存确定缓存等级,用于后续具有针对性的处理。Specifically, each cache is scanned by using multiple threads in each thread pool, and the cache level is determined for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
相应地,对于线程池的权重值的设定方法还可以包括:定期获取各个线程池的扫描结果,根据扫描结果确定各个缓存等级对应的缓存数量;然后根据各个缓存等级对应的缓存数量调整各个线程池的权重值,并根据各个线程池调整后的权重值调整各个线程池内包含的线程的数量。其中,缓存等级对应的缓存数量越多,与该缓存等级匹配的线程池的权重值越大;相反的,缓存等级对应的缓存数量越少,与该缓存等级匹配的线程池的权重值越小。通过缓存数量来确定线程池的权重值,从而决定每个线程池中包含的线程数量,可以使得每个线程池中的线程数量能够准确地满足对应缓存等级中各个缓存的处理操作,使资源得到合理使用,节省成本。Correspondingly, the setting method of the weight value of the thread pool may further include: periodically acquiring the scan result of each thread pool, determining the number of caches corresponding to each cache level according to the scan result; and then adjusting each thread according to the number of caches corresponding to each cache level. The weight value of the pool, and adjust the number of threads included in each thread pool according to the adjusted weight value of each thread pool. The greater the number of caches corresponding to the cache level, the greater the weight of the thread pool matching the cache level. Conversely, the smaller the cache level corresponding to the cache level, the smaller the weight value of the thread pool matching the cache level. . Determining the weight value of the thread pool by the number of caches, thereby determining the number of threads included in each thread pool, so that the number of threads in each thread pool can accurately satisfy the processing operations of each cache in the corresponding cache level, so that the resources are obtained. Reasonable use and cost saving.
在其他实施例中,还可以综合采用步骤S220和步骤S230中提供的线程池的权重值设定方法,从而设置更加合理的线程池的权重值。另外,线程池的权重值还可以进一步根据对应等级的缓存的类型、重要程度等多种因素进行确定。In other embodiments, the weight value setting method of the thread pool provided in step S220 and step S230 may be comprehensively used to set a more reasonable weight value of the thread pool. In addition, the weight value of the thread pool can be further determined according to various factors such as the type and importance of the cache of the corresponding level.
步骤S240:利用每个线程池中的多个线程对缓存等级与该线程池匹配的缓存内的数据进行淘汰。Step S240: Using multiple threads in each thread pool to eliminate data in the cache whose cache level matches the thread pool.
具体地，根据上述步骤确定的缓存等级，利用与各个缓存等级相匹配的线程池中的多个线程对具有对应缓存等级的缓存进行数据淘汰处理。其中，每个线程池可以仅处理一个缓存等级的缓存，例如，步骤S210中将缓存分为HIGH级别、LOW级别和IDLE级别一共三个级别，所以仅需三个线程池与之对应。具体的，线程池1对应HIGH级别，线程池2对应LOW级别，线程池3对应IDLE级别，在这种情况下，线程池1中的所有线程仅处理HIGH级别中的所有缓存，线程池2中所有线程仅处理LOW级别中的所有缓存，线程池3中所有线程仅处理IDLE级别中的所有缓存。当然，当缓存等级较多时，每个线程池也可以用于处理多个缓存等级的缓存。例如，当缓存等级包括六个级别时，也可以由三个线程池进行处理，每个线程池分别处理两个等级的缓存。Specifically, according to the cache levels determined in the above steps, multiple threads in the thread pool matching each cache level perform data elimination on the caches of that level. Each thread pool may handle only one cache level. For example, in step S210 the caches are divided into three levels, HIGH, LOW and IDLE, so only three thread pools are needed: thread pool 1 corresponds to the HIGH level, thread pool 2 to the LOW level, and thread pool 3 to the IDLE level. In this case, all threads in thread pool 1 process only the caches at the HIGH level, all threads in thread pool 2 process only the caches at the LOW level, and all threads in thread pool 3 process only the caches at the IDLE level. Of course, when there are more cache levels, each thread pool may also handle caches of multiple levels. For example, when there are six cache levels, three thread pools may still be used, each handling two levels of caches.
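The level-to-pool dispatch just described can be sketched as follows. This is a minimal illustrative sketch; the pool sizes, the `evict` placeholder, and the cache representation are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# One pool per level, as in the three-level example above.
POOLS = {
    'HIGH': ThreadPoolExecutor(max_workers=4),
    'LOW':  ThreadPoolExecutor(max_workers=2),
    'IDLE': ThreadPoolExecutor(max_workers=1),
}

def evict(cache):
    # placeholder for the actual data-elimination logic on one cache
    cache['evicted'] = True
    return cache['name']

def dispatch(caches_by_level):
    """Submit each cache only to the pool matching its level, so no two
    pools ever work on the same cache."""
    futures = []
    for level, caches in caches_by_level.items():
        pool = POOLS[level]
        futures += [pool.submit(evict, c) for c in caches]
    return [f.result() for f in futures]

done = dispatch({'HIGH': [{'name': 'c1'}], 'LOW': [{'name': 'c2'}], 'IDLE': []})
```

Because each cache is submitted to exactly one pool, the conflict described above (two threads eliminating from the same cache) cannot arise.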
总之，通过缓存等级的划分以及线程池技术的应用，能够更加灵活地实现缓存的扫描及数据淘汰工作。另外，上述的步骤S230以及步骤S240均可以反复多次执行，例如，步骤S230可以每隔预设的第一时间间隔执行一次，步骤S240可以每隔预设的第二时间间隔执行一次。其中，第一时间间隔与第二时间间隔可以相等，也可以不等。另外，第一时间间隔和第二时间间隔既可以是固定值，也可以是动态变化的数值。例如，第一时间间隔可以根据扫描结果进行动态调整：当扫描结果中HIGH级别的缓存数量较多时，缩小第一时间间隔；当扫描结果中HIGH级别的缓存数量较少时，增大第一时间间隔。另外，在步骤S240的每次执行过程中，各个线程池既可以按照相同的执行周期对相应等级的缓存执行淘汰操作，也可以按照不同的执行周期对相应等级的缓存执行淘汰操作。In short, dividing caches into levels and applying thread pool technology makes cache scanning and data elimination more flexible. In addition, steps S230 and S240 may each be executed repeatedly: for example, step S230 may be executed once every preset first time interval, and step S240 once every preset second time interval. The first and second time intervals may or may not be equal, and each may be a fixed value or a dynamically changing value. For example, the first time interval may be adjusted dynamically according to the scan results: when the scan finds many HIGH-level caches, the first time interval is shortened; when it finds few, the first time interval is lengthened. Moreover, in each execution of step S240, the thread pools may evict from their respective levels of caches on the same execution cycle, or on different execution cycles.
For example, the thread pool handling HIGH-level caches may run data elimination on a shorter execution cycle, to prevent HIGH-level caches from running out of free space, while the thread pool handling IDLE-level caches may run on a longer execution cycle to save system overhead. In summary, those skilled in the art can determine the number of executions and the execution timing of steps S230 and S240 in various ways according to actual needs, and the present disclosure does not limit this. It can thus be seen that dividing caches into levels and applying thread pool technology provides more flexibility and controllability for data elimination and can meet the needs of various scenarios.
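The dynamic adjustment of the first time interval described above can be sketched as follows. The thresholds, bounds, and halving/doubling rule are illustrative assumptions only; the disclosure does not fix any particular adjustment formula.

```python
def next_scan_interval(current, high_count, min_s=5, max_s=60):
    """Shrink the scan interval (seconds) when many HIGH-level caches were
    found in the last scan, enlarge it when few were, within fixed bounds.
    All numeric choices here are hypothetical."""
    if high_count > 10:        # many nearly-full caches: scan more often
        return max(min_s, current // 2)
    if high_count < 3:         # few: scan less often to save overhead
        return min(max_s, current * 2)
    return current
```

A scheduler would call this after each scan to choose when to scan next, while the per-pool elimination cycles can remain independent of it.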
在本公开实施例中，进行数据淘汰的具体方法可以由本领域技术人员灵活设置，本公开对此不做限定。例如，可以根据数据写入时间、数据写入次数、数据温度属性、数据类型等多种因素进行淘汰。在本实施例中，数据淘汰方法可以是：根据缓存内的各个数据的写入总次数以及预设的温度属性计算规则，计算缓存内的各个数据的温度属性值，并根据温度属性值确定缓存内的各个数据的淘汰顺序。In the embodiment of the present disclosure, the specific method of data elimination can be set flexibly by those skilled in the art, and the present disclosure does not limit this. For example, data may be eliminated according to various factors such as write time, number of writes, data temperature attribute, and data type. In this embodiment, the data elimination method may be: calculating a temperature attribute value for each piece of data in the cache according to its total number of writes and a preset temperature attribute calculation rule, and determining the elimination order of the data in the cache according to the temperature attribute values.
其中，预设的温度属性计算规则为本领域技术人员根据实际情况所设置的计算各个缓存数据的热门程度的规则。在这里，缓存数据的热门程度可以通过缓存数据被写入的总次数、和/或缓存数据的存储时段等因素进行确定。具体地，在计算各个缓存数据的温度属性值时，可以单独根据各个缓存数据的写入总次数计算各个缓存数据的温度属性值；也可以进一步结合其他因素计算各个缓存数据的温度属性值。本公开对温度属性值的具体计算规则不做限定，只要能够满足用户的实际需求即可。The preset temperature attribute calculation rule is a rule, set by those skilled in the art according to actual conditions, for calculating how popular each piece of cached data is. Here, the popularity of cached data can be determined by factors such as the total number of times the data has been written and/or the storage period of the data. Specifically, the temperature attribute value of each piece of cached data may be calculated from its total write count alone, or further in combination with other factors. The present disclosure does not limit the specific calculation rule for the temperature attribute value, as long as it meets the actual needs of the user.
在计算出各个缓存数据的温度属性值之后，按照上述计算的温度属性值从低到高的顺序，依次淘汰温度属性值最低的缓存数据，以此实现根据缓存数据热门程度来淘汰数据的效果，并且及时有效地释放缓存空间。After the temperature attribute values of the cached data are calculated, the data with the lowest temperature attribute values are eliminated in order, from low to high, so that data is eliminated according to its popularity and cache space is released promptly and effectively.
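The coldest-first elimination order can be sketched as follows, using the simplest rule named above (temperature taken directly from the total write count). The function name and data layout are assumptions for illustration.

```python
def eviction_order(write_counts):
    """write_counts: {key: total number of writes of that datum}.
    Returns keys sorted coldest-first, i.e. lowest temperature first."""
    return sorted(write_counts, key=lambda k: write_counts[k])

order = eviction_order({'a': 42, 'b': 3, 'c': 17})
# 'b' (coldest) would be eliminated first, then 'c', then 'a'
```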
另外，本领域技术人员还可以对上述方案进行各种改动和变形。例如，在根据写入总次数确定温度属性时，除了直接根据写入总次数的数值进行确定外，还可以预先将写入总次数划分为多个数值区间，为各个数值区间分别设置对应的区间分值，并根据该区间分值确定温度属性值。例如，当写入总次数属于【0,10】这一数值区间时，区间分值为1；当写入总次数属于【10,50】这一数值区间时，区间分值为5；当写入总次数属于【50,100】这一数值区间时，区间分值为10。通过区间分值能够更加灵活地将写入总次数位于某一区间内的数据确定为热门数据。而且，为了使数据淘汰方式更为灵活，上述预设的温度属性计算规则还可以包括：预先将缓存对应的缓存时长进一步划分为多个缓存时段，为各个缓存时段分别设置对应的时段权重值；针对每个缓存数据，根据该缓存数据各次写入时对应的缓存时段的时段权重值确定该缓存数据的温度属性值。缓存时长可以为：由缓存中写入时间最早的数据所对应的第一数据写入时间和写入时间最晚的数据所对应的第二数据写入时间所界定的时间长度。而且，缓存时长也可以是预先设定的时间长度，例如，假设一个缓存专用于存储最近的三小时之内的缓存数据，一旦写入缓存的缓存数据的写入时间超过三小时则会自动删除，则该缓存的缓存时长为3小时。在将缓存时长划分为多个缓存时段时，可以将整个缓存时长划分为多个均等的缓存时段，也可以将整个缓存时长划分为多个不等的缓存时段。为了便于根据缓存时段计算缓存数据的温度属性，在进行上述划分之后，可选地，还可以分别针对每个缓存时段设置与该缓存时段对应的时段数据表，其中，各个时段数据表用于记录相应的缓存时段内写入的缓存数据。为了便于根据缓存时段确定缓存数据的淘汰顺序，在本实施例中，还需要为上述划分的各个缓存时段设置对应的时段权重值，其设置方式也是多样的。具体地，可以将各个时段的权重值设置为均等的，这样更侧重于从各个缓存数据出现的次数这一方面去计算缓存数据的温度属性值；或者，也可以按照缓存时段在时间上从前往后的顺序对应地设置递增的（或递减的）时段权重值，这样侧重于将缓存数据的出现次数与写入时间进行结合来计算各个缓存的数据的温度属性值。在这里，各个时段的权重值设置由本领域技术人员根据实际情况而定，本公开对此不作限制。总之，通过缓存时段以及时段权重值的设置方式，使用户能够根据实际需求优先淘汰非重要时段内的数据，使淘汰方案更为灵活。In addition, various modifications and variations may be made to the above solution by those skilled in the art. For example, when determining the temperature attribute from the total write count, in addition to using the count directly, the total write count may be divided in advance into several numerical intervals, with a corresponding interval score set for each interval, and the temperature attribute value then determined from the interval score. For example, when the total number of writes falls in the interval [0, 10], the interval score is 1; when it falls in [10, 50], the interval score is 5; and when it falls in [50, 100], the interval score is 10.
Interval scores make it possible to flexibly treat data whose total write count falls within a certain interval as hot data. Moreover, to make data elimination more flexible, the preset temperature attribute calculation rule may further include: dividing the cache duration of a cache into multiple cache periods in advance, and setting a corresponding period weight value for each cache period; then, for each piece of cached data, determining its temperature attribute value from the period weight values of the cache periods in which its writes occurred. The cache duration may be the length of time bounded by the first data write time, corresponding to the earliest-written data in the cache, and the second data write time, corresponding to the latest-written data. The cache duration may also be a preset length of time: for example, if a cache is dedicated to storing data from the last three hours, and cached data is automatically deleted once it has been in the cache for more than three hours, the cache duration of that cache is 3 hours. When dividing the cache duration into cache periods, the whole duration may be divided into multiple equal periods, or into multiple unequal periods. To facilitate calculating the temperature attribute of cached data by cache period, after the above division, a period data table corresponding to each cache period may optionally be set up, with each table recording the cached data written during its period. To facilitate determining the elimination order from the cache periods, a corresponding period weight value also needs to be set for each of the divided cache periods, and this can be done in various ways.
Specifically, the weight values of the periods may be set equal, which emphasizes the number of occurrences of each piece of cached data when calculating its temperature attribute value; alternatively, increasing (or decreasing) period weight values may be set following the temporal order of the cache periods, which emphasizes combining the number of occurrences of the cached data with its write times. The period weight values are set by those skilled in the art according to actual conditions, and the present disclosure does not limit this. In short, the way cache periods and period weight values are set enables users to preferentially eliminate data from unimportant periods according to actual needs, making the elimination scheme more flexible.
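The two refinements above — interval scores for write counts, and period weights for write times — can be sketched together as follows. The exact intervals and weights come from the examples in the text; the boundary handling (half-open intervals) and all function names are assumptions of this sketch.

```python
def interval_score(total_writes):
    """Map a total write count to an interval score, following the example
    intervals above: [0,10) -> 1, [10,50) -> 5, otherwise 10."""
    if total_writes < 10:
        return 1
    if total_writes < 50:
        return 5
    return 10

def temperature(write_periods, period_weights):
    """write_periods: the cache-period index of each recorded write of one
    datum; its temperature is the sum of the weights of those periods, so
    increasing weights make recent writes count for more."""
    return sum(period_weights[p] for p in write_periods)

score = interval_score(42)                 # 42 falls in [10, 50)
temp = temperature([0, 2, 2], [1, 2, 3])   # 1 + 3 + 3
```

With equal weights `[1, 1, 1]` the same call would simply count occurrences, matching the first setting described above.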
另外，上述预设的等级划分规则除了可以按照存储空间进行划分外，还可以进一步根据缓存内存储的数据类型等其他因素进行划分，总之，本公开对缓存等级的划分方式以及线程池的权重设置方式不做限定。In addition, besides being based on storage space, the above preset level division rule may be further based on other factors such as the type of data stored in the cache. In short, the present disclosure does not limit the manner of dividing cache levels or of setting thread pool weights.
在本公开实施例中,上述各个线程池之间均相互并行运行,由此可以使数据处理效率进一步提高。In the embodiment of the present disclosure, each of the above thread pools runs in parallel with each other, thereby further improving data processing efficiency.
由此可见，本公开实施例二提供的一种基于多个缓存的数据淘汰方法可以按照预设的等级划分规则划分多个缓存等级，分别为各个缓存等级创建匹配的线程池；利用各个线程池中的多个线程分别扫描各个缓存，根据扫描结果以及等级划分规则确定各个缓存的缓存等级；利用每个线程池中的多个线程对缓存等级与该线程池匹配的缓存内的数据进行淘汰。由此解决了现有技术中单线程处理效率低下的问题，实现了在不同权重的缓存集合上多线程并行同时进行数据淘汰操作，在保证一致性的同时大大提高了数据淘汰处理效率，同时，通过对线程池的改进，可以在并行的同时保证缓存集合的淘汰优先级。It can thus be seen that the multiple-cache-based data elimination method provided in Embodiment 2 of the present disclosure divides caches into multiple cache levels according to a preset level division rule and creates a matching thread pool for each cache level; uses multiple threads in the thread pools to scan the caches and determines the cache level of each cache according to the scan results and the level division rule; and uses multiple threads in each thread pool to eliminate data from the caches whose cache level matches that thread pool. This solves the problem of inefficient single-threaded processing in the prior art and allows data elimination to be carried out in parallel by multiple threads over cache sets of different weights, greatly improving data elimination efficiency while ensuring consistency; at the same time, the improved thread pools preserve the elimination priority of the cache sets while working in parallel.
实施例三Embodiment 3
图3示出了本公开实施例三提供的一种基于多个缓存的数据淘汰装置的结构示意图，如图所示，该装置包括：划分模块310、扫描模块320和淘汰模块330。FIG. 3 is a schematic structural diagram of a multiple-cache-based data elimination apparatus provided in Embodiment 3 of the present disclosure. As shown in the figure, the apparatus includes: a partitioning module 310, a scanning module 320, and an elimination module 330.
划分模块310,用于按照预设的等级划分规则划分多个缓存等级,分别为各个缓存等级创建匹配的线程池。The dividing module 310 is configured to divide a plurality of cache levels according to a preset level dividing rule, and respectively create matching thread pools for each cache level.
其中,预设的等级划分规则用于将各个缓存根据其不同的使用情况来划分成不同等级,同一等级内的各个缓存具有相近的使用情况。该等级是由技术人员人为划定的。对于预设的等级划分规则的具体内容,本公开实施例三对此不作具体限定,本领域技术人员可以根据实际情况灵活设定。The preset level division rule is used to divide each cache into different levels according to different usage conditions, and each cache in the same level has similar usage. This level is artificially defined by the technician. For the specific content of the preset level division rule, the third embodiment of the present disclosure does not specifically limit this, and those skilled in the art can flexibly set according to actual conditions.
为了提高数据淘汰的处理效率，划分模块310分别为各个缓存等级创建匹配的线程池，每个线程池中都包含多个线程。每个线程池中的多个线程均用于对应等级的缓存的数据淘汰处理。因为不同等级的缓存的使用情况不一样，为了尽可能地优化资源配置，所以不同等级对应的线程池中的线程个数也可以不同。To improve the efficiency of data elimination, the partitioning module 310 creates a matching thread pool for each cache level, each containing multiple threads, which perform data elimination for the caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the thread pools corresponding to different levels may also differ, so as to optimize resource allocation as far as possible.
之所以采用线程池技术，是因为如果为每一个缓存都设置相应的线程进行处理，将消耗大量系统资源，不具有现实操作性。Thread pool technology is adopted because, if a dedicated thread were set up for every cache, a large amount of system resources would be consumed, which is not practically feasible.
扫描模块320,用于利用各个线程池中的多个线程分别扫描各个缓存,根据扫描结果以及等级划分规则确定各个缓存的缓存等级。The scanning module 320 is configured to separately scan each cache by using multiple threads in each thread pool, and determine the cache level of each cache according to the scan result and the level division rule.
因为当多个线程对所有的缓存进行数据淘汰处理时，如果不对线程与缓存的匹配关系进行限定，就可能出现两个线程对同一个缓存同时进行处理的情况，此时，两个线程就会产生冲突，导致一系列问题。所以，本公开实施例通过对所有缓存进行等级划分，并规定各个线程与不同等级的缓存的对应处理关系，从而有效避免了上述冲突情况的发生，优化了工作流程。When multiple threads perform data elimination on all caches, if the matching relationship between threads and caches is not constrained, two threads may simultaneously process the same cache; the two threads then conflict, causing a series of problems. Therefore, the embodiment of the present disclosure classifies all caches into levels and specifies which threads handle which level of cache, thereby effectively avoiding such conflicts and streamlining the workflow.
具体地,扫描模块320利用各个线程池中的多个线程,分别扫描各个缓存,根据扫描结果和等级划分规则为每一个被扫描过的缓存确定缓存等级,用于后续具有针对性的处理。Specifically, the scanning module 320 scans each cache by using multiple threads in each thread pool, and determines a cache level for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
淘汰模块330,用于利用每个线程池中的多个线程对缓存等级与该线程池匹配的缓存内的数据进行淘汰。The eliminating module 330 is configured to use multiple threads in each thread pool to eliminate data in a cache whose cache level matches the thread pool.
具体地,根据扫描模块320确定的缓存等级,淘汰模块330利用与各个缓存等级相匹配的线程池中的多个线程对具有对应缓存等级的缓存进行数据淘汰处理。对于数据淘汰处理的具体方法,本公开实施例三对此不作具体限 定,本领域技术人员可以根据实际情况灵活设定。Specifically, according to the cache level determined by the scanning module 320, the culling module 330 performs data elimination processing on the cache having the corresponding cache level by using a plurality of threads in the thread pool matching the respective cache levels. For the specific method of the data elimination processing, the third embodiment of the present disclosure does not specifically limit this. Those skilled in the art can flexibly set according to actual conditions.
关于上述各个模块的具体结构和工作原理可参照方法实施例中相应部分的描述,此处不再赘述。For the specific structure and working principle of each module mentioned above, reference may be made to the description of corresponding parts in the method embodiments, and details are not described herein again.
由此可见，本公开实施例提供的一种基于多个缓存的数据淘汰装置可以按照预设的等级划分规则划分多个缓存等级，分别为各个缓存等级创建匹配的线程池；利用各个线程池中的多个线程分别扫描各个缓存，根据扫描结果以及等级划分规则确定各个缓存的缓存等级；利用每个线程池中的多个线程对缓存等级与该线程池匹配的缓存内的数据进行淘汰。由此可见，通过将缓存划分为多个缓存等级，并分别针对各个缓存等级创建对应的线程池，能够更好地根据缓存等级调整线程池中的线程数量；并且，通过多个线程池并行处理的方式大大提高了数据淘汰处理效率。It can thus be seen that the multiple-cache-based data elimination apparatus provided in the embodiment of the present disclosure divides caches into multiple cache levels according to a preset level division rule and creates a matching thread pool for each cache level; uses multiple threads in the thread pools to scan the caches and determines the cache level of each cache according to the scan results and the level division rule; and uses multiple threads in each thread pool to eliminate data from the caches whose cache level matches that thread pool. By dividing caches into multiple levels and creating a corresponding thread pool for each level, the number of threads in each pool can be better adjusted to the cache level; moreover, parallel processing by multiple thread pools greatly improves data elimination efficiency.
实施例四Embodiment 4
图4示出了本公开实施例四提供的一种基于多个缓存的数据淘汰装置的结构示意图，如图所示，该装置包括：划分模块410、权重模块420、扫描模块430和淘汰模块440。FIG. 4 is a schematic structural diagram of a multiple-cache-based data elimination apparatus provided in Embodiment 4 of the present disclosure. As shown in the figure, the apparatus includes: a partitioning module 410, a weighting module 420, a scanning module 430, and an elimination module 440.
划分模块410,用于按照预设的等级划分规则划分多个缓存等级,分别为各个缓存等级创建匹配的线程池。The dividing module 410 is configured to divide a plurality of cache levels according to a preset level dividing rule, and respectively create matching thread pools for each cache level.
其中，预设的等级划分规则用于将各个缓存根据其不同的使用情况来划分成不同等级，同一等级内的各个缓存具有相近的使用情况。在本公开实施例中，该等级划分规则包括：按照缓存的剩余存储空间与总存储空间之间的比值划分缓存等级，其中，剩余存储空间与总存储空间之间的比值越大，缓存等级越高；剩余存储空间与总存储空间之间的比值越小，缓存等级越低。例如，假设缓存等级分为三级，分别为高（HIGH）级别、低（LOW）级别和空闲（IDLE）级别，其中，将缓存的剩余存储空间与总存储空间之间的比值在60%以上的缓存确定为HIGH级别；将缓存的剩余存储空间与总存储空间之间的比值在30%与60%之间的缓存确定为LOW级别；将缓存的剩余存储空间与总存储空间之间的比值在30%以下的缓存确定为IDLE级别。The preset level division rule is used to divide the caches into different levels according to their usage, with caches in the same level having similar usage. In the embodiment of the present disclosure, the level division rule includes: dividing cache levels according to the ratio of a cache's remaining storage space to its total storage space, where the larger the ratio, the higher the cache level, and the smaller the ratio, the lower the cache level. For example, suppose the cache levels are divided into three: a high (HIGH) level, a low (LOW) level, and an idle (IDLE) level. Caches whose ratio of remaining storage space to total storage space is above 60% are determined to be HIGH level; caches whose ratio is between 30% and 60% are determined to be LOW level; and caches whose ratio is below 30% are determined to be IDLE level.
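The three-level rule just given can be sketched as follows. The 60% and 30% thresholds come from the example in the text; the handling of values exactly on a boundary is an assumption of this sketch, since the text leaves it open.

```python
def cache_level(free_space, total_space):
    """Classify a cache by the ratio of its remaining (free) storage space
    to its total storage space: above 60% -> HIGH, 30%-60% -> LOW,
    below 30% -> IDLE."""
    ratio = free_space / total_space
    if ratio > 0.6:
        return 'HIGH'
    if ratio > 0.3:
        return 'LOW'
    return 'IDLE'
```

A scanning thread would call this for every cache it visits and hand the cache to the thread pool matching the returned level.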
为了提高数据淘汰的处理效率，划分模块410分别为各个缓存等级创建匹配的线程池，每个线程池中都包含多个线程。每个线程池中的多个线程均用于对应等级的缓存的数据淘汰处理。因为不同等级的缓存的使用情况不一样，为了尽可能地优化资源配置，所以不同等级对应的线程池中的线程个数也不相同。To improve the efficiency of data elimination, the partitioning module 410 creates a matching thread pool for each cache level, each containing multiple threads, which perform data elimination for the caches of the corresponding level. Because caches at different levels have different usage, the number of threads in the thread pools corresponding to different levels also differs, so as to optimize resource allocation as far as possible.
权重模块420,用于为各个线程池分别设置对应的权重值,根据各个线程池的权重值设置各个线程池内包含的线程的数量。The weight module 420 is configured to separately set a corresponding weight value for each thread pool, and set the number of threads included in each thread pool according to the weight value of each thread pool.
对于权重值的具体设定方法，可以是针对每个线程池，权重模块420根据与该线程池匹配的缓存等级的高低设置该线程池对应的权重值，其中，与该线程池匹配的缓存等级越高，该线程池的权重值越大；相反的，与该线程池匹配的缓存等级越低，该线程池的权重值也就越小。其中，线程池的权重值越大，线程池内包含的线程的数量越多；线程池的权重值越小，线程池内包含的线程的数量就越少。因此，每个线程池内包含的线程数量都是动态变化的。As for the specific method of setting the weight values: for each thread pool, the weighting module 420 may set the pool's weight value according to the cache level matched with it, where the higher the matched cache level, the larger the pool's weight value, and conversely, the lower the matched cache level, the smaller the pool's weight value. The larger a thread pool's weight value, the more threads it contains; the smaller its weight value, the fewer threads it contains. The number of threads in each thread pool therefore changes dynamically.
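The level-based weight setting described above can be sketched as follows. The particular weight values and the threads-per-weight factor are illustrative assumptions; the disclosure only requires that higher levels get larger weights and that thread counts follow the weights.

```python
# Hypothetical weights: the higher the matched cache level, the larger
# the weight of the corresponding thread pool.
LEVEL_WEIGHTS = {'HIGH': 3, 'LOW': 2, 'IDLE': 1}

def pool_threads(level, threads_per_weight=2):
    """Derive a pool's thread count from its weight value."""
    return LEVEL_WEIGHTS[level] * threads_per_weight
```

When the scan results change a cache's level, the affected pools' weights, and hence their thread counts, change with it, which is why the thread counts are dynamic.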
扫描模块430,用于利用各个线程池中的多个线程分别扫描各个缓存,根据扫描结果以及等级划分规则确定各个缓存的缓存等级。The scanning module 430 is configured to separately scan each cache by using multiple threads in each thread pool, and determine a cache level of each cache according to the scan result and the level division rule.
因为当多个线程对所有的缓存进行数据淘汰处理时，如果不对线程与缓存的匹配关系进行限定，就会出现两个线程对同一个缓存进行处理的情况，此时，两个线程就会产生冲突，导致一系列问题。所以，本公开实施例通过对所有缓存进行等级划分，并规定各个线程与不同等级的缓存的对应处理关系，从而有效避免了上述冲突情况的发生，优化了工作流程。When multiple threads perform data elimination on all caches, if the matching relationship between threads and caches is not constrained, two threads may end up processing the same cache; the two threads then conflict, causing a series of problems. Therefore, the embodiment of the present disclosure classifies all caches into levels and specifies which threads handle which level of cache, thereby effectively avoiding such conflicts and streamlining the workflow.
具体地,扫描模块430利用各个线程池中的多个线程,分别扫描各个缓存,根据扫描结果和等级划分规则为每一个被扫描过的缓存确定缓存等级,用于后续具有针对性的处理。Specifically, the scanning module 430 scans each cache by using multiple threads in each thread pool, and determines a cache level for each scanned cache according to the scan result and the level division rule, for subsequent targeted processing.
相应地，对于线程池的权重值的设定方法还可以包括：定期获取各个线程池的扫描结果，根据扫描结果确定各个缓存等级对应的缓存数量；然后根据各个缓存等级对应的缓存数量调整各个线程池的权重值，并根据各个线程池调整后的权重值调整各个线程池内包含的线程的数量。其中，缓存等级对应的缓存数量越多，与该缓存等级匹配的线程池的权重值越大；相反的，缓存等级对应的缓存数量越少，与该缓存等级匹配的线程池的权重值越小。通过缓存数量来确定线程池的权重值，从而决定每个线程池中包含的线程数量，可以使得每个线程池中的线程数量能够准确地满足对应缓存等级中各个缓存的处理操作，使资源得到合理使用，节省成本。Correspondingly, the method of setting thread pool weight values may further include: periodically acquiring the scan results of each thread pool and determining, from those results, the number of caches at each cache level; then adjusting the weight value of each thread pool according to the number of caches at its corresponding cache level, and adjusting the number of threads in each thread pool according to its adjusted weight value. The more caches a cache level contains, the larger the weight value of the thread pool matching that level; conversely, the fewer caches a cache level contains, the smaller the weight value of the matching thread pool. Determining thread pool weight values from cache counts, and thus the number of threads in each pool, ensures that each pool has just enough threads to handle the caches at its level, so resources are used sensibly and cost is saved.
在其他实施例中,还可以综合采用权重模块420和扫描模块430中提供的线程池的权重值设定方法,从而设置更加合理的线程池的权重值。另外,线程池的权重值还可以进一步根据对应等级的缓存的类型、重要程度等多种因素进行确定。In other embodiments, the weight value setting method of the thread pool provided in the weight module 420 and the scanning module 430 may be comprehensively combined to set a more reasonable weight value of the thread pool. In addition, the weight value of the thread pool can be further determined according to various factors such as the type and importance of the cache of the corresponding level.
淘汰模块440:利用每个线程池中的多个线程对缓存等级与该线程池匹配的缓存内的数据进行淘汰。The elimination module 440: utilizes multiple threads in each thread pool to eliminate data in the cache whose cache level matches the thread pool.
具体地，根据上述模块确定的缓存等级，淘汰模块440利用与各个缓存等级相匹配的线程池中的多个线程对具有对应缓存等级的缓存进行数据淘汰处理。其中，每个线程池可以仅处理一个缓存等级的缓存，例如，划分模块410中将缓存分为HIGH级别、LOW级别和IDLE级别一共三个级别，所以仅需三个线程池与之对应。具体的，线程池1对应HIGH级别，线程池2对应LOW级别，线程池3对应IDLE级别，在这种情况下，线程池1中的所有线程仅处理HIGH级别中的所有缓存，线程池2中所有线程仅处理LOW级别中的所有缓存，线程池3中所有线程仅处理IDLE级别中的所有缓存。当然，当缓存等级较多时，每个线程池也可以用于处理多个缓存等级的缓存。例如，当缓存等级包括六个级别时，也可以由三个线程池进行处理，每个线程池分别处理两个等级的缓存。Specifically, according to the cache levels determined by the above modules, the elimination module 440 uses multiple threads in the thread pool matching each cache level to perform data elimination on the caches of that level. Each thread pool may handle only one cache level. For example, the partitioning module 410 divides the caches into three levels, HIGH, LOW and IDLE, so only three thread pools are needed: thread pool 1 corresponds to the HIGH level, thread pool 2 to the LOW level, and thread pool 3 to the IDLE level. In this case, all threads in thread pool 1 process only the caches at the HIGH level, all threads in thread pool 2 process only the caches at the LOW level, and all threads in thread pool 3 process only the caches at the IDLE level. Of course, when there are more cache levels, each thread pool may also handle caches of multiple levels. For example, when there are six cache levels, three thread pools may still be used, each handling two levels of caches.
总之，通过缓存等级的划分以及线程池技术的应用，能够更加灵活地实现缓存的扫描及数据淘汰工作。另外，上述的扫描模块430以及淘汰模块440均可以反复多次运行，例如，扫描模块430可以每隔预设的第一时间间隔运行一次，淘汰模块440可以每隔预设的第二时间间隔运行一次。其中，第一时间间隔与第二时间间隔可以相等，也可以不等。另外，第一时间间隔和第二时间间隔既可以是固定值，也可以是动态变化的数值。例如，第一时间间隔可以根据扫描结果进行动态调整：当扫描结果中HIGH级别的缓存数量较多时，缩小第一时间间隔；当扫描结果中HIGH级别的缓存数量较少时，增大第一时间间隔。另外，在淘汰模块440的每次运行过程中，各个线程池既可以按照相同的执行周期对相应等级的缓存执行淘汰操作，也可以按照不同的执行周期对相应等级的缓存执行淘汰操作。例如，用于处理HIGH级别的缓存的线程池可以按照较短的执行周期进行数据淘汰操作，以防止HIGH级别的缓存的可用空间不足；用于处理IDLE级别的缓存的线程池可以按照较长的执行周期进行数据淘汰操作，以节省系统开销。总之，本领域技术人员可根据实际需要灵活采用各种方式确定上述的扫描模块430以及淘汰模块440的运行次数以及运行时机，本公开对此不做限定。由此可见，通过缓存等级的划分以及线程池技术的应用，为数据淘汰操作提供了更多的灵活性和可控性，能够满足各类场景的需求。In short, dividing caches into levels and applying thread pool technology makes cache scanning and data elimination more flexible. In addition, the scanning module 430 and the elimination module 440 may each run repeatedly: for example, the scanning module 430 may run once every preset first time interval, and the elimination module 440 once every preset second time interval. The first and second time intervals may or may not be equal, and each may be a fixed value or a dynamically changing value. For example, the first time interval may be adjusted dynamically according to the scan results: when the scan finds many HIGH-level caches, the first time interval is shortened; when it finds few, the first time interval is lengthened. Moreover, in each run of the elimination module 440, the thread pools may evict from their respective levels of caches on the same execution cycle, or on different execution cycles. For example, the thread pool handling HIGH-level caches may run data elimination on a shorter execution cycle, to prevent HIGH-level caches from running out of free space, while the thread pool handling IDLE-level caches may run on a longer execution cycle to save system overhead. In summary, those skilled in the art can determine the number of runs and the run timing of the scanning module 430 and the elimination module 440 in various ways according to actual needs, and the present disclosure does not limit this. It can thus be seen that dividing caches into levels and applying thread pool technology provides more flexibility and controllability for data elimination and can meet the needs of various scenarios.
在本公开实施例中,淘汰模块440进行数据淘汰的具体方法可以由本领域技术人员灵活设置,本公开对此不做限定。例如,可以根据数据写入时间、数据写入次数、数据温度属性、数据类型等多种因素进行淘汰。在本实施例中,数据淘汰方法可以是:根据缓存内的各个数据的写入总次数以及预设的温度属性计算规则,计算缓存内的各个数据的温度属性值,并根据温度属性值确定缓存内的各个数据的淘汰顺序。In the embodiment of the present disclosure, the specific method for the data elimination by the eliminating module 440 can be flexibly set by a person skilled in the art, which is not limited by the disclosure. For example, it can be eliminated based on various factors such as data write time, number of data writes, data temperature attributes, and data types. In this embodiment, the data elimination method may be: calculating a temperature attribute value of each data in the cache according to a total number of writes of each data in the cache and a preset temperature attribute calculation rule, and determining a cache according to the temperature attribute value. The order of elimination of each data within.
Here, the preset temperature attribute calculation rule is a rule, set by those skilled in the art according to the actual situation, for calculating how hot each piece of cached data is. The popularity of cached data can be determined by factors such as the total number of times it has been written and/or the period during which it has been stored. Specifically, the temperature attribute value of each piece of cached data may be calculated from its total write count alone, or from the total write count combined with other factors. The present disclosure does not limit the specific calculation rule for the temperature attribute value, as long as it meets the user's actual needs.
After the temperature attribute values of the cached data have been calculated, the cached data with the lowest temperature attribute value is eliminated first, in ascending order of temperature attribute value. This achieves the effect of eliminating data according to how hot it is, and frees cache space promptly and effectively.
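A minimal sketch of this coldest-first ordering, assuming the simplest rule in which the temperature attribute value is just the total write count (richer rules appear below):

```python
def temperature(total_writes):
    # Simplest possible rule from the text: the temperature attribute
    # value is the total number of writes; other rules may fold in
    # additional factors such as the storage period.
    return total_writes

def eviction_order(write_counts):
    """Given {key: total write count}, return keys in elimination
    order: lowest temperature (coldest) first."""
    return sorted(write_counts, key=lambda k: temperature(write_counts[k]))
```

The elimination module would then walk this list from the front, deleting entries until enough cache space is free.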
In addition, those skilled in the art can make various modifications and variations to the above scheme. For example, when determining the temperature attribute from the total write count, instead of using the count directly, the range of possible counts can be divided in advance into several numeric intervals, a corresponding score can be set for each interval, and the temperature attribute value can be determined from the interval score. For example, when the total write count falls in the interval [0, 10], the interval score is 1; in [10, 50], the score is 5; in [50, 100], the score is 10. Interval scores make it possible to flexibly treat data whose total write count falls within a given interval as hot data.
Moreover, to make data elimination still more flexible, the preset temperature attribute calculation rule may further include: dividing the cache duration of a cache in advance into multiple cache periods and setting a corresponding period weight value for each; then, for each piece of cached data, determining its temperature attribute value from the period weight values of the cache periods in which its writes occurred. The cache duration may be the length of time bounded by the first data write time (that of the data written earliest into the cache) and the second data write time (that of the data written latest). The cache duration may also be a preset length of time: for example, suppose a cache is dedicated to storing data from the last three hours, with data deleted automatically once it has been in the cache for more than three hours; the cache duration of that cache is then three hours. When dividing the cache duration into cache periods, the whole duration may be divided into multiple equal periods or into periods of unequal length. To make it easier to compute the temperature attribute of cached data per period, after this division a period data table may optionally be set up for each cache period, each table recording the cached data written during its period.
To make it easier to determine the elimination order of cached data from the cache periods, in this embodiment a corresponding period weight value also needs to be set for each of the divided cache periods, and this can be done in various ways. Specifically, the weight values of all periods may be set equal, which emphasizes the number of occurrences of each piece of cached data when computing its temperature attribute value; alternatively, increasing (or decreasing) period weight values may be assigned to the periods in chronological order, which combines the number of occurrences of the cached data with its write times when computing the temperature attribute value. The weight values of the periods are set by those skilled in the art according to the actual situation, and the present disclosure does not limit them. In short, cache periods and period weight values let users preferentially eliminate data from unimportant periods according to actual needs, making the elimination scheme more flexible.
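The interval scores and period weights just described might look as follows in code; the boundary handling of the intervals and the example weights are assumptions for illustration:

```python
def interval_score(total_writes):
    """Map a total write count to the example interval scores from the
    text: roughly [0,10) -> 1, [10,50) -> 5, [50,100] -> 10.
    (The text leaves the shared boundaries ambiguous; half-open
    intervals are assumed here.)"""
    if total_writes < 10:
        return 1
    if total_writes < 50:
        return 5
    return 10

def weighted_temperature(writes_per_period, period_weights):
    """Temperature of one piece of data: its write count in each cache
    period, weighted by that period's weight value. Increasing weights
    toward later periods favour recently written data."""
    return sum(n * w for n, w in zip(writes_per_period, period_weights))
```

With equal weights the second function reduces to a plain occurrence count, matching the first setting option described above.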
In addition, besides dividing by storage space, the preset level division rule may further divide caches according to other factors, such as the type of data stored in the cache. In short, the present disclosure does not limit how cache levels are divided or how thread pool weights are set.
In embodiments of the present disclosure, the thread pools described above all run in parallel with one another, which further improves data processing efficiency.
For the specific structure and working principle of each of the above modules, refer to the description of the corresponding parts in the method embodiments; details are not repeated here.
It can thus be seen that the multiple-cache-based data elimination apparatus provided by Embodiment 4 of the present disclosure can divide caches into multiple cache levels according to a preset level division rule and create a matching thread pool for each cache level; use the multiple threads in each thread pool to scan the caches and determine each cache's level according to the scan results and the level division rule; and use the multiple threads in each thread pool to eliminate data in the caches whose level matches that pool. This solves the low efficiency of single-threaded processing in the prior art and enables multi-threaded, parallel data elimination over cache sets of different weights, greatly improving elimination efficiency while ensuring consistency; at the same time, the improvement to the thread pools guarantees the elimination priority of the cache sets while they run in parallel.
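Taken together, the apparatus's flow — dividing levels by the free-space ratio of claim 5, sizing one pool per level by its weight, and eliminating in parallel — might be sketched as below. The level names, ratio thresholds, and weights are illustrative assumptions, not values fixed by the disclosure:

```python
from concurrent.futures import ThreadPoolExecutor

POOL_WEIGHTS = {"HIGH": 4, "MID": 2, "IDLE": 1}  # bigger weight -> more threads

def cache_level(free_space, total_space):
    """Claim 5's rule: the larger the ratio of remaining to total
    storage space, the higher the cache level (the 0.5 and 0.2
    thresholds are assumed example values)."""
    ratio = free_space / total_space
    if ratio > 0.5:
        return "HIGH"
    if ratio > 0.2:
        return "MID"
    return "IDLE"

def run_elimination(caches, evict):
    """One pool per level, sized by the level's weight; each pool
    eliminates the caches whose level matches it, and the pools run
    in parallel with one another."""
    pools = {lvl: ThreadPoolExecutor(max_workers=w)
             for lvl, w in POOL_WEIGHTS.items()}
    for cache in caches:
        pools[cache_level(cache["free"], cache["total"])].submit(evict, cache)
    for pool in pools.values():
        pool.shutdown(wait=True)
```

The `evict` callback stands in for any per-cache elimination strategy, such as the temperature-ordered deletion described above.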
FIG. 5 schematically shows a block diagram of a computing device for performing the multiple-cache-based data elimination method according to an embodiment of the present disclosure. The computing device conventionally includes a processor 510 and a computer program product or computer-readable medium in the form of a storage device 520. The storage device 520 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The storage device 520 has a storage space 530 holding program code 531 for performing any of the method steps described above. For example, the storage space 530 may include individual pieces of program code 531, each implementing one of the steps of the above methods. The program code can be read from, or written to, one or more computer program products, which include program code carriers such as a hard disk, a compact disc (CD), a memory card, or a floppy disk. Such a computer program product is typically a portable or fixed storage unit such as that shown in FIG. 6. The storage unit may have storage segments, storage space, and the like arranged similarly to the storage device 520 in the computing device of FIG. 5, and the program code may, for example, be compressed in an appropriate form.
Typically, the storage unit includes computer-readable code 531', i.e., code readable by a processor such as 510, for performing the method steps according to the present disclosure; when run by the computing device, this code causes the computing device to perform the steps of the methods described above.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein, and the structure required to construct such systems is apparent from the description above. Moreover, the present disclosure is not directed at any particular programming language. It should be understood that the content of the present disclosure described herein can be implemented in a variety of programming languages, and that the descriptions above referring to a specific language are given to disclose the best mode of the present disclosure.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the present disclosure can be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that in the above description of exemplary embodiments of the present disclosure, to streamline the disclosure and aid understanding of one or more of its aspects, the features of the present disclosure are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the disclosed aspects lie in less than all features of a single previously disclosed embodiment. The claims following the detailed description are therefore hereby expressly incorporated into that description, with each claim standing on its own as a separate embodiment of the present disclosure.
Those skilled in the art will understand that the modules in the devices of an embodiment can be changed adaptively and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and they may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the present disclosure and to form different embodiments. For example, in the following claims, any one of the claimed embodiments can be used in any combination.
The component embodiments of the present disclosure may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of a multiple-cache-based data elimination apparatus according to embodiments of the present disclosure. The present disclosure may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the present disclosure may be stored on a computer-readable medium, or may take the form of one or more signals; such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present disclosure, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of multiple such elements. The present disclosure can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (16)

  1. A multiple-cache-based data elimination method, comprising:
    dividing caches into multiple cache levels according to a preset level division rule, and creating a matching thread pool for each cache level, wherein each thread pool contains multiple threads;
    scanning the caches with the multiple threads in each thread pool, and determining the cache level of each cache according to the scan results and the level division rule;
    using the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
  2. The method according to claim 1, further comprising: setting a corresponding weight value for each thread pool, and setting the number of threads contained in each thread pool according to its weight value, wherein the larger a thread pool's weight value, the more threads it contains.
  3. The method according to claim 2, wherein the step of setting a corresponding weight value for each thread pool and setting the number of threads contained in each thread pool according to its weight value specifically comprises:
    periodically obtaining the scan results of each thread pool, and determining the number of caches at each cache level according to the scan results;
    adjusting the weight value of each thread pool according to the number of caches at the corresponding cache level, and adjusting the number of threads contained in each thread pool according to its adjusted weight value;
    wherein the more caches a cache level has, the larger the weight value of the thread pool matching that cache level.
  4. The method according to claim 2 or 3, wherein the step of setting a corresponding weight value for each thread pool further comprises:
    for each thread pool, setting its weight value according to how high the cache level matching it is, wherein the higher the matching cache level, the larger the thread pool's weight value.
  5. The method according to any one of claims 1-4, wherein the preset level division rule comprises: dividing cache levels according to the ratio of a cache's remaining storage space to its total storage space, wherein the larger the ratio of remaining to total storage space, the higher the cache level.
  6. The method according to any one of claims 1-5, wherein the step of using the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool specifically comprises:
    calculating a temperature attribute value for each piece of data in the cache according to its total number of writes and a preset temperature attribute calculation rule, and determining the elimination order of the data in the cache according to the temperature attribute values.
  7. The method according to any one of claims 1-6, wherein the thread pools run in parallel with one another.
  8. A multiple-cache-based data elimination apparatus, comprising:
    a division module, configured to divide caches into multiple cache levels according to a preset level division rule and create a matching thread pool for each cache level, wherein each thread pool contains multiple threads;
    a scanning module, configured to scan the caches with the multiple threads in each thread pool and determine the cache level of each cache according to the scan results and the level division rule;
    an elimination module, configured to use the multiple threads in each thread pool to eliminate data in the caches whose cache level matches that thread pool.
  9. The apparatus according to claim 8, further comprising: a weight module, configured to set a corresponding weight value for each thread pool and set the number of threads contained in each thread pool according to its weight value, wherein the larger a thread pool's weight value, the more threads it contains.
  10. The apparatus according to claim 9, wherein the weight module is specifically configured to:
    periodically obtain the scan results of each thread pool, and determine the number of caches at each cache level according to the scan results;
    adjust the weight value of each thread pool according to the number of caches at the corresponding cache level, and adjust the number of threads contained in each thread pool according to its adjusted weight value;
    wherein the more caches a cache level has, the larger the weight value of the thread pool matching that cache level.
  11. The apparatus according to claim 9 or 10, wherein the weight module is further configured to:
    for each thread pool, set its weight value according to how high the cache level matching it is, wherein the higher the matching cache level, the larger the thread pool's weight value.
  12. The apparatus according to any one of claims 8-11, wherein the preset level division rule comprises: dividing cache levels according to the ratio of a cache's remaining storage space to its total storage space, wherein the larger the ratio of remaining to total storage space, the higher the cache level.
  13. The apparatus according to any one of claims 8-12, wherein the elimination module is specifically configured to:
    calculate a temperature attribute value for each piece of data in the cache according to its total number of writes and a preset temperature attribute calculation rule, and determine the elimination order of the data in the cache according to the temperature attribute values.
  14. The apparatus according to any one of claims 8-13, wherein the thread pools run in parallel with one another.
  15. A computer program comprising computer-readable code which, when run on a computing device, causes the computing device to perform the multiple-cache-based data elimination method according to any one of claims 1-7.
  16. A computer-readable medium storing the computer program according to claim 15.
PCT/CN2017/115616 2016-12-29 2017-12-12 Multiple buffer-based data elimination method and device WO2018121242A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611246005.8A CN106649139B (en) 2016-12-29 2016-12-29 Data elimination method and device based on multiple caches
CN201611246005.8 2016-12-29

Publications (1)

Publication Number Publication Date
WO2018121242A1 true WO2018121242A1 (en) 2018-07-05

Family

ID=58836170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/115616 WO2018121242A1 (en) 2016-12-29 2017-12-12 Multiple buffer-based data elimination method and device

Country Status (2)

Country Link
CN (1) CN106649139B (en)
WO (1) WO2018121242A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795632A (en) * 2019-10-30 2020-02-14 北京达佳互联信息技术有限公司 State query method and device and electronic equipment

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
CN106649139B (en) * 2016-12-29 2020-01-10 北京奇虎科技有限公司 Data elimination method and device based on multiple caches
CN107301215B (en) * 2017-06-09 2020-12-18 北京奇艺世纪科技有限公司 Search result caching method and device and search method and device
CN107608911B (en) * 2017-09-12 2020-09-22 苏州浪潮智能科技有限公司 Cache data flashing method, device, equipment and storage medium
CN111078585B (en) * 2019-11-29 2022-03-29 智器云南京信息科技有限公司 Memory cache management method, system, storage medium and electronic equipment
CN111552652B (en) * 2020-07-13 2020-11-17 深圳鲲云信息科技有限公司 Data processing method and device based on artificial intelligence chip and storage medium
CN115729767A (en) * 2021-08-30 2023-03-03 华为技术有限公司 Temperature detection method and device for memory

Citations (6)

Publication number Priority date Publication date Assignee Title
US7209437B1 (en) * 1998-10-15 2007-04-24 British Telecommunications Public Limited Company Computer communication providing quality of service
CN101561783A (en) * 2008-04-14 2009-10-21 阿里巴巴集团控股有限公司 Method and device for Cache asynchronous elimination
CN102541460A (en) * 2010-12-20 2012-07-04 中国移动通信集团公司 Multiple disc management method and equipment
CN103345452A (en) * 2013-07-18 2013-10-09 四川九成信息技术有限公司 Data caching method in multiple buffer storages according to weight information
CN105404595A (en) * 2014-09-10 2016-03-16 阿里巴巴集团控股有限公司 Cache management method and apparatus
CN106649139A (en) * 2016-12-29 2017-05-10 北京奇虎科技有限公司 Data eliminating method and device based on multiple caches

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
US6349363B2 (en) * 1998-12-08 2002-02-19 Intel Corporation Multi-section cache with different attributes for each section
US6990557B2 (en) * 2002-06-04 2006-01-24 Sandbridge Technologies, Inc. Method and apparatus for multithreaded cache with cache eviction based on thread identifier
US7200713B2 (en) * 2004-03-29 2007-04-03 Intel Corporation Method of implementing off-chip cache memory in dual-use SRAM memory for network processors
CN101609432B (en) * 2009-07-13 2011-04-13 中国科学院计算技术研究所 Shared cache management system and method thereof
CN103279429A (en) * 2013-05-24 2013-09-04 浪潮电子信息产业股份有限公司 Application-aware distributed global shared cache partition method
CN103399856B (en) * 2013-07-01 2017-09-15 北京科东电力控制系统有限责任公司 Towards the explosion type data buffer storage processing system and its method of SCADA system
CN104881492B (en) * 2015-06-12 2018-11-30 北京京东尚科信息技术有限公司 Data filtering method and device based on caching allocation methods



Also Published As

Publication number Publication date
CN106649139A (en) 2017-05-10
CN106649139B (en) 2020-01-10


Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17888840; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17888840; Country of ref document: EP; Kind code of ref document: A1)