CN110674121B - Cache data cleaning method, device, equipment and computer readable storage medium - Google Patents

Cache data cleaning method, device, equipment and computer readable storage medium

Info

Publication number
CN110674121B
CN110674121B (Application CN201910780198.2A)
Authority
CN
China
Prior art keywords
cache
data
increment
node
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910780198.2A
Other languages
Chinese (zh)
Other versions
CN110674121A (en)
Inventor
宋杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910780198.2A priority Critical patent/CN110674121B/en
Priority to PCT/CN2019/118233 priority patent/WO2021031408A1/en
Publication of CN110674121A publication Critical patent/CN110674121A/en
Application granted granted Critical
Publication of CN110674121B publication Critical patent/CN110674121B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 - Design, administration or maintenance of databases
    • G06F16/215 - Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24552 - Database cache management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 - Approximate or statistical queries
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a cache data cleaning method, a device, equipment and a computer readable storage medium, relating to the field of internet technologies. The method can predict the predicted cache increment of a circular queue and count the current total data amount; when the sum of the predicted cache increment and the total data amount is greater than or equal to a cleaning threshold, target cache nodes with low importance coefficients are cleaned, so that the cleaned cache data is the least important, the utilization value of the cache space is improved, and user stickiness is higher. The method comprises the following steps: predicting the predicted cache increment of the circular queue according to the historical cache increments of the circular queue in historical time periods; counting the total data amount of the cache data currently stored in the circular queue, and calculating the sum of the total data amount and the predicted cache increment; when the sum is greater than or equal to the cleaning threshold, calculating an importance coefficient of each cache node based on the node position and node access rate of each cache node in the circular queue; and cleaning the cache data of the target cache nodes in the circular queue.

Description

Cache data cleaning method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a method, an apparatus, a device, and a computer readable storage medium for cleaning cache data.
Background
With the continuous development of internet technology, more and more intelligent terminals have entered people's daily work and life. Through an intelligent terminal, a user can interact with other users and obtain text, video or picture information from the network. When a user exchanges information with other users through the intelligent terminal, the generated interaction information needs to be stored locally on the intelligent terminal; data stored locally on the intelligent terminal is called cache data. Over time, the user performs more and more information interaction on the intelligent terminal, so the amount of cache data grows larger and larger while the storage space of the intelligent terminal is limited. The cache data stored in the intelligent terminal therefore needs to be cleaned regularly.
In the related art, when the intelligent terminal cleans cache data, it generally does so on a first-in first-out basis: a fixed data amount is set, and the earliest-cached data of that fixed amount is cleaned.
In carrying out the present invention, the inventors have found that the related art has at least the following problems:
some cache data is a user's frequently used data, which the user accesses repeatedly while using the intelligent terminal. First-in first-out cleaning is likely to delete such frequently used data by mistake, leaving in the cache space data that is neither meaningful to nor commonly used by the user, so the utilization value of the cache space is low and user stickiness is low.
Disclosure of Invention
In view of the above, the present invention provides a method, apparatus, device and computer readable storage medium for cleaning cache data, which mainly aim to solve the problems that the current cache space has a low utilization value and user stickiness is low.
According to a first aspect of the present invention, there is provided a method for cleaning cache data, the method comprising:
predicting a predicted cache increment of a circular queue according to historical cache increments of the circular queue in a plurality of historical time periods, wherein the circular queue comprises a plurality of cache nodes for storing cache data;
counting the total data amount of the cache data currently stored in the circular queue, and calculating the sum of the total data amount and the predicted cache increment;
When the sum is greater than or equal to a cleaning threshold, calculating an importance coefficient of each cache node based on the node position and the node access rate of each cache node in the circular queue;
and clearing cache data of a target cache node in the circular queue, wherein the importance coefficient of the target cache node is lower than that of other cache nodes in the circular queue.
In another embodiment, the predicting the predicted cache increment of the circular queue according to the historical cache increments of the circular queue in a plurality of historical time periods includes:
respectively counting a plurality of historical cache increments of the circular queue in a plurality of historical time periods, and calculating an average increment of the plurality of historical cache increments;
acquiring at least one prediction coefficient and at least one coefficient weight corresponding to the at least one prediction coefficient, and calculating at least one unit increment based on the at least one prediction coefficient and the average increment;
calculating the product of each unit increment and its corresponding coefficient weight to obtain at least one increment product;
and calculating a product sum of the at least one increment product, calculating a weight sum of the at least one coefficient weight, and taking a first ratio of the product sum to the weight sum as the predicted cache increment.
In another embodiment, the acquiring at least one prediction coefficient and at least one coefficient weight corresponding to the at least one prediction coefficient, and calculating at least one unit increment based on the at least one prediction coefficient and the average increment, includes:
for each prediction coefficient of the at least one prediction coefficient, calculating a first product of the prediction coefficient and a first historical cache increment of the plurality of historical cache increments, and calculating a second product of the prediction coefficient and the average increment;
calculating a first sum of the first product and the average increment, and taking the difference between the first sum and the second product as a first process value;
updating the first historical cache increment in the calculation process to a second historical cache increment, replacing the average increment with the first process value, and repeatedly executing the calculation process until the plurality of historical cache increments are traversed, to obtain the unit increment of the prediction coefficient, wherein the second historical cache increment is the next historical cache increment after the first historical cache increment among the plurality of historical cache increments;
and repeatedly executing the process of generating the unit increment to obtain at least one unit increment of the at least one prediction coefficient.
In another embodiment, the calculating the importance coefficient of each cache node includes:
for each cache node in the circular queue, determining a node position of the cache node;
inquiring the data importance of the cache data stored by the cache node, and counting the node access rate of the cache node;
and determining the cleaning number, calculating a second ratio of the node position to the cleaning number, and taking the sum of the second ratio, the data importance and the node access rate as an importance coefficient of the cache node.
In another embodiment, the querying the data importance of the cache data stored by the cache node includes:
reading the data content of the cache data stored by the cache node, and determining the data type of the cache data stored by the cache node;
and inquiring the data importance corresponding to the data type as the data importance of the cache data stored by the cache node.
In another embodiment, the method further comprises:
When data to be cached is received, storing the data to be cached in an idle cache node of the circular queue, wherein the idle cache node does not store the cached data;
and moving the idle cache node to the head of the circular queue.
In another embodiment, the method further comprises:
if the data to be cached carries an expiration time, marking the data to be cached with the process time, and recording the storage duration of the data to be cached in the circular queue;
and when the storage duration reaches the expiration time, cleaning the data to be cached.
According to a second aspect of the present invention, there is provided a cache data cleaning device, the device comprising:
the prediction module is used for predicting the predicted cache increment of the circular queue according to the historical cache increment of the circular queue in a plurality of historical time periods, and the circular queue comprises a plurality of cache nodes for storing cache data;
the statistics module is used for counting the total data quantity of the cache data currently stored in the circular queue and calculating the sum of the total data quantity and the predicted cache increment;
the calculation module is used for calculating the importance coefficient of each cache node based on the node position and the node access rate of each cache node in the circular queue when the sum is larger than or equal to a cleaning threshold value;
And the cleaning module is used for cleaning the cache data of the target cache node in the circular queue, and the importance coefficient of the target cache node is lower than that of other cache nodes in the circular queue.
In another embodiment, the prediction module includes:
a statistics unit, configured to respectively count a plurality of historical cache increments of the circular queue in a plurality of historical time periods, and calculate an average increment of the plurality of historical cache increments;
a first calculation unit, configured to acquire at least one prediction coefficient and at least one coefficient weight corresponding to the at least one prediction coefficient, and calculate at least one unit increment based on the at least one prediction coefficient and the average increment;
a second calculation unit, configured to calculate the product of each unit increment and its corresponding coefficient weight, to obtain at least one increment product;
and a third calculation unit, configured to calculate a product sum of the at least one increment product, calculate a weight sum of the at least one coefficient weight, and take a first ratio of the product sum to the weight sum as the predicted cache increment.
In another embodiment, the first calculation unit is configured to calculate, for each prediction coefficient of the at least one prediction coefficient, a first product of the prediction coefficient and a first historical cache increment of the plurality of historical cache increments, and calculate a second product of the prediction coefficient and the average increment; calculate a first sum of the first product and the average increment, and take the difference between the first sum and the second product as a first process value; update the first historical cache increment in the calculation process to a second historical cache increment, replace the average increment with the first process value, and repeatedly execute the calculation process until the plurality of historical cache increments are traversed, to obtain the unit increment of the prediction coefficient, wherein the second historical cache increment is the next historical cache increment after the first historical cache increment among the plurality of historical cache increments; and repeatedly execute the process of generating the unit increment to obtain at least one unit increment of the at least one prediction coefficient.
In another embodiment, the computing module includes:
a determining unit, configured to determine, for each cache node in the circular queue, a node position of the cache node;
the statistics unit is used for inquiring the data importance of the cache data stored by the cache node and counting the node access rate of the cache node;
and the calculating unit is used for determining the cleaning number, calculating a second ratio of the node position to the cleaning number, and taking the sum of the second ratio, the data importance and the node access rate as an importance coefficient of the cache node.
In another embodiment, the statistics unit is configured to read data content of the cache data stored in the cache node, and determine a data type of the cache data stored in the cache node; and inquiring the data importance corresponding to the data type as the data importance of the cache data stored by the cache node.
In another embodiment, the apparatus further comprises:
the storage module is used for storing the data to be cached in idle cache nodes of the circular queue when the data to be cached is received, wherein the idle cache nodes do not store the cached data;
And the moving module is used for moving the idle cache node to the head of the circular queue.
In another embodiment, the apparatus further comprises:
the marking module is used for marking the data to be cached with the process time and recording the storage duration of the data to be cached in the circular queue if the data to be cached carries an expiration time;
and the cleaning module is further used for cleaning the data to be cached when the storage duration reaches the expiration time.
According to a third aspect of the present invention, there is provided a device comprising a memory storing a computer program and a processor that implements the steps of the method of the first aspect when executing the computer program.
According to a fourth aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of the first aspect.
By means of the above technical scheme, compared with the existing first-in first-out cleaning, the cache data cleaning method, device, equipment and computer readable storage medium provided by the present invention can predict the predicted cache increment of the circular queue and count the total data amount of the cache data currently stored in the circular queue. When the sum of the predicted cache increment and the total data amount is greater than the cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, ensuring that the cleaned cache data is the least important to the user, improving the utilization value of the cache space and making user stickiness higher.
The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be more clearly understood and implemented in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more readily apparent, the following detailed description of preferred embodiments is given.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
Fig. 1 shows a flow chart of a cache data cleaning method according to an embodiment of the present invention;
Fig. 2 shows a flow chart of another cache data cleaning method according to an embodiment of the present invention;
Fig. 3A shows a schematic structural diagram of a cache data cleaning device according to an embodiment of the present invention;
Fig. 3B shows a schematic structural diagram of a cache data cleaning device according to an embodiment of the present invention;
Fig. 3C shows a schematic structural diagram of a cache data cleaning device according to an embodiment of the present invention;
Fig. 3D shows a schematic structural diagram of a cache data cleaning device according to an embodiment of the present invention;
Fig. 3E shows a schematic structural diagram of a cache data cleaning device according to an embodiment of the present invention;
Fig. 4 shows a schematic device structure diagram of a computer device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The embodiment of the invention provides a cache data cleaning method, which can predict the predicted cache increment of a circular queue and count the total data amount of the cache data currently stored in the circular queue. When the sum of the predicted cache increment and the total data amount is greater than a cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, ensuring that the cleaned cache data is the least important to the user, improving the utilization value of the cache space and making user stickiness higher. As shown in fig. 1, the method comprises the following steps:
101. Predict the predicted cache increment of the circular queue according to the historical cache increments of the circular queue in a plurality of historical time periods, the circular queue comprising a plurality of cache nodes for storing cache data.
102. Count the total data amount of the cache data currently stored in the circular queue, and calculate the sum of the total data amount and the predicted cache increment.
103. When the sum is greater than or equal to the cleaning threshold, calculate an importance coefficient of each cache node based on the node position and node access rate of each cache node in the circular queue.
104. Clean the cache data of the target cache node in the circular queue, the importance coefficient of the target cache node being lower than the importance coefficients of the other cache nodes in the circular queue.
According to the method provided by the embodiment of the invention, the predicted cache increment of the circular queue can be predicted and the total data amount of the cache data currently stored in the circular queue counted. When the sum of the predicted cache increment and the total data amount is greater than the cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, so that the cleaned cache data is the least important to the user, the utilization value of the cache space is improved, and user stickiness is higher.
The embodiment of the invention provides a cache data cleaning method, which can predict the predicted cache increment of a circular queue and count the total data amount of the cache data currently stored in the circular queue. When the sum of the predicted cache increment and the total data amount is greater than a cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, ensuring that the cleaned cache data is the least important to the user, improving the utilization value of the cache space and making user stickiness higher. As shown in fig. 2, the method comprises the following steps:
201. Predict the predicted cache increment of the circular queue according to the historical cache increments of the circular queue in a plurality of historical time periods.
In the embodiment of the invention, a circular queue for storing cache data is arranged. The medium for storing the cache data in the circular queue is a cache node; the circular queue is formed by a plurality of cache nodes, and each node stores the cache data of one service.
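As an illustration only, the structure just described can be sketched as follows. The class and field names (CacheNode, CircularQueue, access_count, store_time) are hypothetical rather than taken from the patent, and the bookkeeping fields merely anticipate the access-rate and expiration handling discussed later.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical sketch of the circular queue described above: each cache
# node stores the cache data of one service, plus bookkeeping fields used
# by the later steps (access counting and expiration handling).
@dataclass
class CacheNode:
    service_id: str
    data: bytes = b""
    access_count: int = 0    # feeds the node access rate R in equation 2
    store_time: float = 0.0  # feeds the expiration handling discussed in step 204

class CircularQueue:
    def __init__(self):
        self.nodes = deque()  # deque of CacheNode; head = most recently stored

    def total_data_amount(self) -> int:
        # Sum of the sub-data amounts of every cache node (used in step 202).
        return sum(len(node.data) for node in self.nodes)
```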
In order to determine when to clean the cache data, statistics need to be performed on the circular queue, and whether the cache data in the circular queue needs cleaning is determined according to the counted data amount. Considering that the growth of the cache data has peaks, that is, a large amount of cache data is likely to be stored into the circular queue within a certain period of time, the future growth of the circular queue needs to be predicted when the circular queue is counted, so as to prepare in advance, avoid paralysis of the whole circular queue caused by a large influx of cache data, and ensure that the statistics are comprehensive and complete.
Specifically, the predicted cache increment can be calculated through a prediction algorithm comprising the following steps one to three.
Step one, respectively counting a plurality of historical cache increments of the circular queue in a plurality of historical time periods, and calculating the average increment of the plurality of historical cache increments.
For example, taking 30 min as one time period, count the cache increments of the 20 historical time periods before the current time, denoted r1, r2, r3 … r20, and calculate the average increment as S0 = (r1 + r2 + r3 + … + r20) / 20.
And step two, acquiring at least one prediction coefficient and at least one coefficient weight corresponding to the at least one prediction coefficient, and calculating at least one unit increment based on the at least one prediction coefficient and the average increment.
The at least one prediction coefficient and the at least one corresponding coefficient weight may be adjusted for different application scenarios: for example, they may be increased appropriately during peak time periods and reduced appropriately during valley time periods. For example, the prediction coefficients may be a0 = 0.1, a1 = 0.3, a2 = 0.5, a3 = 0.4, and the coefficient weights may be w0 = w1 = w2 = w3 = 0.25. In this way, at least one unit increment may be calculated based on the at least one prediction coefficient and the average increment.
In addition, except for the first prediction, the coefficient weights need to be dynamically adjusted before each prediction. Taking the coefficients above as an example, the adjustment proceeds as follows: compare r20 with S21,0, S21,1, S21,2 and S21,3 respectively; multiply the coefficient weight whose unit increment differs least from r20 by 1.05, and multiply the coefficient weight whose unit increment differs most from r20 by 0.95, thereby adjusting the coefficient weights.
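A minimal sketch of this weight adjustment, assuming the unit increments S21,0 … S21,3 have already been computed; the function name and argument order are illustrative.

```python
# Boost the weight whose unit increment is closest to the newest observed
# increment (r20) by 1.05, and damp the weight farthest from it by 0.95.
def adjust_weights(weights, unit_increments, latest_increment):
    diffs = [abs(s - latest_increment) for s in unit_increments]
    adjusted = list(weights)
    adjusted[diffs.index(min(diffs))] *= 1.05
    adjusted[diffs.index(max(diffs))] *= 0.95
    return adjusted
```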
Specifically, when calculating the unit increment, for each prediction coefficient of the at least one prediction coefficient, first calculate a first product of the prediction coefficient and the first historical cache increment among the plurality of historical cache increments, and a second product of the prediction coefficient and the average increment; then calculate a first sum of the first product and the average increment, and take the difference between the first sum and the second product as the first process value. Then keep updating the first historical cache increment in the calculation to the second historical cache increment, that is, the next historical cache increment among the plurality of historical cache increments, replace the average increment with the first process value, and repeat the calculation until the plurality of historical cache increments have been traversed, obtaining the unit increment of that prediction coefficient.
Continuing with the example in steps one and two, the above process can be expressed as: first, calculate S1,0 = a0 × r1 + (1 - a0) × S0; then calculate S2,0 = a0 × r2 + (1 - a0) × S1,0; then S3,0 = a0 × r3 + (1 - a0) × S2,0; and so on, until S21,0 = a0 × r20 + (1 - a0) × S20,0 is calculated. By repeatedly performing the above process of generating the unit increment for each prediction coefficient, at least one unit increment of the at least one prediction coefficient is obtained, e.g. S21,1 = a1 × r20 + (1 - a1) × S20,1; S21,2 = a2 × r20 + (1 - a2) × S20,2; S21,3 = a3 × r20 + (1 - a3) × S20,3.
Step three, generating the predicted cache increment according to the at least one unit increment and the at least one coefficient weight.
After the at least one unit increment is calculated, the predicted cache increment can be generated based on the at least one unit increment and the at least one coefficient weight. First, multiply each unit increment by its corresponding coefficient weight to obtain at least one increment product; then calculate the product sum of the at least one increment product and the weight sum of the at least one coefficient weight, and take the first ratio of the product sum to the weight sum as the predicted cache increment. The above calculation can be expressed by the following equation 1:
Equation 1: Kz = (S21,0 × w0 + S21,1 × w1 + S21,2 × w2 + S21,3 × w3) / (w0 + w1 + w2 + w3)
wherein Kz denotes the predicted cache increment.
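Putting steps one to three together, a minimal sketch might look as follows. It assumes the per-coefficient recursion S(k) = a × r(k) + (1 - a) × S(k-1) described above, with each prediction coefficient running its own smoothing chain; the function name, sample history and sample coefficients are illustrative, not taken from the patent.

```python
# Sketch of the prediction algorithm (steps one to three), under the stated
# assumption that each prediction coefficient runs its own smoothing chain.
def predict_cache_increment(history, coefficients, weights):
    # Step one: average increment S0 over the historical increments r1..rn.
    s0 = sum(history) / len(history)
    # Step two: for each prediction coefficient, traverse every historical
    # increment, replacing the average with the first process value each
    # time, to obtain that coefficient's unit increment.
    unit_increments = []
    for a in coefficients:
        s = s0
        for r in history:
            s = a * r + (1 - a) * s
        unit_increments.append(s)
    # Step three: product sum of the increment products over the weight sum
    # (equation 1).
    product_sum = sum(s * w for s, w in zip(unit_increments, weights))
    return product_sum / sum(weights)

# Example with the coefficients and weights quoted above (history values illustrative).
history = [120, 90, 150, 110, 130] * 4          # 20 historical increments
kz = predict_cache_increment(history, [0.1, 0.3, 0.5, 0.4], [0.25] * 4)
```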
Through the processes from the first step to the third step, the predicted cache increment of the circular queue can be obtained, and then the cleaning of the cache data is executed based on the predicted cache increment.
202. Count the total data amount of the cache data currently stored in the circular queue, and calculate the sum of the total data amount and the predicted cache increment.
In the embodiment of the invention, after the predicted cache increment of the circular queue is obtained, the total data amount of the cache data currently stored in the circular queue can be counted, and the sum of the total data amount and the predicted cache increment is calculated. When the total data amount of the cache data currently stored in the circular queue is counted, the sub-data amount of each cache node can be counted first, and the sum of the sub-data amounts is calculated, so that the total data amount is obtained.
203. When the sum is greater than or equal to the cleaning threshold, calculate an importance coefficient of each cache node based on the node position and node access rate of each cache node in the circular queue.
In the embodiment of the present invention, if data cleaning were performed only once the circular queue is already saturated, some important cache data might be unable to be written into the circular queue for a long time. Therefore, to determine when to perform cache data cleaning, a cleaning threshold for evaluating the sum can be set, and the cleaning process is executed when the sum is greater than or equal to the cleaning threshold. The cleaning threshold can be a user-defined value and adjusted for different scenarios. In addition, since the prediction process shown in step 201 may be performed multiple times, every prediction except the first needs to dynamically adjust the cleaning threshold. Specifically, if the predicted cache increment is greater than the sum of the plurality of historical cache increments, the cleaning threshold is adjusted to 0.99 times its original value; if the predicted cache increment is less than the sum of the plurality of historical cache increments, the cleaning threshold is adjusted to 1.01 times its original value. It should be noted that however the cleaning threshold is adjusted, its value stays smaller than the limit value.
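A sketch of this threshold adjustment follows; limit_value stands for the unnamed "limit value" above, and treating it as a cap is an interpretation rather than stated logic.

```python
# Adjust the cleaning threshold after each prediction except the first:
# shrink it when growth is expected to accelerate, enlarge it otherwise.
def adjust_cleaning_threshold(threshold, predicted_increment, history, limit_value):
    if predicted_increment > sum(history):
        threshold *= 0.99
    elif predicted_increment < sum(history):
        threshold *= 1.01
    return min(threshold, limit_value)  # assumed cap: threshold stays below the limit
```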
When the sum is greater than or equal to the cleaning threshold, the cache data of the circular queue needs to be cleaned. Some cache data stored in the circular queue is important to the user and accessed frequently, while other cache data is unimportant and may be accessed only once; therefore, the importance coefficient of the cache data stored in each node needs to be calculated, and the cache data cleaned according to the importance coefficients. Specifically, for each cache node in the circular queue: first, determine the node position of the cache node; then, read the data content of the cache data stored by the cache node, determine its data type, query the data importance corresponding to that data type as the data importance of the cached data, and count the node access rate of the cache node; finally, determine the cleaning number and calculate a second ratio of the node position to the cleaning number, which represents the weight of the node position in the overall cache cleaning, and take the sum of the second ratio, the data importance and the node access rate as the importance coefficient of the cache node. The above calculation can be expressed by the following equation 2:
Equation 2: C = 0.4 × N / M + 0.2 × K + 0.4 × R
wherein C denotes the importance coefficient.
N denotes the node position. It should be noted that, to ensure that cache nodes near the head of the circular queue are treated as more important than those near the tail, node positions are recorded in reverse order: the cache node at the first position of the circular queue has the largest N value, and the cache node at the last position has the smallest. For example, assuming the current circular queue includes 10 cache nodes, if node A is at the first position of the circular queue, its node position, that is, the value of N for node A, is 10; if node B is at the end of the circular queue, the value of N for node B is 1.
M denotes the cleaning number. When cleaning cache data, the cache data in the nodes is cleaned in a fixed number, so an M indicating the cleaning number can be set, and the cache data in M nodes is cleaned each time. M may be expressed as a percentage, for example 5%, and may be set manually; M is generally greater than the remaining capacity in the circular queue.
K denotes the data importance, defined according to the data type of the service the data belongs to: common data has a data importance of 1, core data has an importance of 3, and financial transaction data has an importance of 5. Data importance is set manually and can be adjusted for different scenarios.
R denotes the node access rate, which indicates how frequently data is accessed, so that rarely accessed data can be cleaned preferentially. When calculating the node access rate, first count the number of times the current node has been accessed; then count the total number of times all nodes have been accessed; finally, calculate the ratio of the first count to the total count and take this ratio as the node access rate.
After the values of these variables are determined, the importance coefficient of each node can be calculated, so that the nodes whose cache data should be cleaned can be determined according to the importance coefficients.
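A sketch of equation 2 for a single cache node; the importance lookup table and the function signature are illustrative, with only the 1/3/5 importance values taken from the description above.

```python
# Data importance K per data type, as listed above (set manually in practice).
DATA_IMPORTANCE = {"common": 1, "core": 3, "financial": 5}

# Equation 2: C = 0.4 * N / M + 0.2 * K + 0.4 * R
def importance_coefficient(node_position, cleaning_number, data_type, access_rate):
    k = DATA_IMPORTANCE[data_type]
    return 0.4 * node_position / cleaning_number + 0.2 * k + 0.4 * access_rate

# e.g. importance_coefficient(10, 2, "core", 0.12) for the head node of a
# 10-node queue with cleaning number M = 2.
```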
204. Clean the cache data of the target cache nodes in the circular queue.
In the embodiment of the invention, after the importance coefficient of each node is generated, the cache nodes in the circular queue can be sorted by importance coefficient in descending order to obtain a sorting result. The cleaning number of cache nodes whose importance coefficients are at the tail of the sorting result are then taken as the target cache nodes, so that the importance coefficients of the target cache nodes are lower than those of the other cache nodes in the circular queue, and the cache data in the target cache nodes is cleaned. For example, if the cleaning number is 2 and the sorting result is A, C, D, B, the cache data in D and B is cleaned.
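The selection step can be sketched as follows, assuming a mapping from node identifier to importance coefficient; the names are illustrative.

```python
# Sort nodes by importance coefficient in descending order and take the
# cleaning number of nodes at the tail of the sorting result as targets.
def pick_target_nodes(coefficients, cleaning_number):
    ranked = sorted(coefficients, key=coefficients.get, reverse=True)
    return ranked[-cleaning_number:]

# e.g. pick_target_nodes({"A": 2.1, "C": 1.8, "D": 0.9, "B": 0.7}, 2) -> ["D", "B"]
```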
In practical application, because the node position is considered when calculating a cache node's importance coefficient, newly written cache data could otherwise be cleaned shortly after being stored. Therefore, when data to be cached is received, it is stored in an idle cache node of the circular queue, that is, a cache node that stores no cache data, and the idle cache node is moved to the head of the circular queue, ensuring that each cache node's importance coefficient reflects reality. In addition, some cache data is data the user interacts with and accesses frequently, so the data to be cached may already be stored in the circular queue, that is, it hits existing cache data in the circular queue. In that case, the cache node where the data resides can be determined and moved directly to the head of the circular queue without storing the data again.
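A sketch of this write path, reusing the hypothetical CircularQueue and CacheNode above; representing the idle node as a freshly created node is a simplification of the patent's pre-allocated idle nodes.

```python
import time

# On a hit, move the existing node to the head without rewriting the data;
# otherwise place the data in an idle node and move that node to the head.
def store(queue, service_id, data):
    for node in queue.nodes:
        if node.service_id == service_id:  # data to be cached hits existing cache data
            queue.nodes.remove(node)
            queue.nodes.appendleft(node)
            return
    queue.nodes.appendleft(CacheNode(service_id, data, store_time=time.time()))
```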
It should be noted that if the data to be cached carries an expiration time, the data is marked with the process time, the storage duration of the data in the circular queue is recorded, and the data is cleaned when the storage duration reaches the expiration time. Furthermore, some data to be cached carries no expiration time; to avoid such data never being cleaned and occupying the terminal's cache space indefinitely, an expiration time can be user-defined when the data is written, the storage duration recorded from the moment of writing, and the data cleaned when the storage duration is greater than or equal to the expiration time.
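The expiration rule might be sketched as follows, again against the hypothetical structures above; the expirations mapping from service to expiration time in seconds is an assumed representation.

```python
import time

# Clean every entry whose storage duration has reached its expiration time.
def clean_expired(queue, expirations, now=None):
    now = time.time() if now is None else now
    for node in list(queue.nodes):          # copy: we mutate while iterating
        expire_after = expirations.get(node.service_id)
        if expire_after is not None and now - node.store_time >= expire_after:
            queue.nodes.remove(node)
```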
According to the method provided by the embodiment of the invention, the predicted cache increment of the circular queue can be predicted and the total data amount of the cache data currently stored in the circular queue counted. When the sum of the predicted cache increment and the total data amount is greater than the cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, so that the cleaned cache data is the least important to the user, the utilization value of the cache space is improved, and user stickiness is higher.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present invention provides a device for cleaning cache data, as shown in fig. 3A, where the device includes: a prediction module 301, a statistics module 302, a calculation module 303 and a cleaning module 304.
The prediction module 301 is configured to predict a predicted cache growth amount of a circular queue according to a historical cache growth amount of the circular queue in a plurality of historical time periods, where the circular queue includes a plurality of cache nodes for storing cache data;
the statistics module 302 is configured to count a total amount of data of the cache data currently stored in the circular queue, and calculate a sum of the total amount of data and the predicted cache growth;
The calculating module 303 is configured to calculate, when the sum is greater than or equal to a cleaning threshold, an importance coefficient of each cache node based on a node position and a node access rate of each cache node in the circular queue;
the cleaning module 304 is configured to clean cache data of a target cache node in the circular queue, where an importance coefficient of the target cache node is lower than an importance coefficient of other cache nodes in the circular queue.
In a specific application scenario, as shown in fig. 3B, the prediction module 301 specifically includes: a statistics unit 3011, a first calculation unit 3012, a second calculation unit 3013, and a third calculation unit 3014.
The statistics unit 3011 is configured to respectively count a plurality of historical cache increments of the circular queue in a plurality of historical time periods, and calculate an average increment of the plurality of historical cache increments;
the first calculation unit 3012 is configured to acquire at least one prediction coefficient and at least one coefficient weight corresponding to the at least one prediction coefficient, and calculate at least one unit increment based on the at least one prediction coefficient and the average increment;
the second calculation unit 3013 is configured to calculate the product of each unit increment and its corresponding coefficient weight, to obtain at least one increment product;
the third calculation unit 3014 is configured to calculate a product sum of the at least one increment product, calculate a weight sum of the at least one coefficient weight, and take a first ratio of the product sum to the weight sum as the predicted cache increment.
In a specific application scenario, the first calculation unit 3012 is configured to calculate, for each prediction coefficient of the at least one prediction coefficient, a first product of the prediction coefficient and a first historical cache increment of the plurality of historical cache increments, and calculate a second product of the prediction coefficient and the average increment; calculate a first sum of the first product and the average increment, and take the difference between the first sum and the second product as a first process value; update the first historical cache increment in the calculation process to a second historical cache increment, replace the average increment with the first process value, and repeatedly execute the calculation process until the plurality of historical cache increments are traversed, to obtain the unit increment of the prediction coefficient, wherein the second historical cache increment is the next historical cache increment after the first historical cache increment among the plurality of historical cache increments; and repeatedly execute the process of generating the unit increment to obtain at least one unit increment of the at least one prediction coefficient.
In a specific application scenario, as shown in fig. 3C, the computing module 303 specifically includes: a determining unit 3031, a counting unit 3032 and a calculating unit 3033.
The determining unit 3031 is configured to determine, for each cache node in the circular queue, a node location of the cache node;
the statistics unit 3032 is configured to query the data importance of the cache data stored by the cache node, and count the node access rate of the cache node;
the calculating unit 3033 is configured to determine a cleaning number, calculate a second ratio of the node position to the cleaning number, and use a sum of the second ratio, the data importance and the node access rate as an importance coefficient of the cache node.
In a specific application scenario, the statistics unit 3032 is configured to read data content of the cache data stored in the cache node, and determine a data type of the cache data stored in the cache node; and inquiring the data importance corresponding to the data type as the data importance of the cache data stored by the cache node.
In a specific application scenario, as shown in fig. 3D, the apparatus further includes: a storage module 305 and a movement module 306.
The storage module 305 is configured to store, when data to be cached is received, the data to be cached in an idle cache node of the circular queue, where the idle cache node does not store the cached data;
the moving module 306 is configured to move the idle cache node to a head of the circular queue.
In a specific application scenario, as shown in fig. 3E, the apparatus further includes: the marking module 307.
The marking module 307 is configured to mark the data to be cached by using the process time if the data to be cached carries an expiration time, and record a storage duration of the data to be cached in the circular queue;
the cleaning module 304 is further configured to clean the data to be cached when the storage duration reaches the expiration time.
The device provided by the embodiment of the invention can predict the predicted cache increment of the circular queue and count the total data amount of the cache data currently stored in the circular queue. When the sum of the predicted cache increment and the total data amount is greater than the cleaning threshold, a target cache node with a low importance coefficient is selected in the circular queue for cache data cleaning, ensuring that the cleaned cache data is the least important to the user, improving the utilization value of the cache space and making user stickiness higher.
It should be noted that, for other descriptions of the functional units of the cache data cleaning device provided by the embodiment of the present application, reference may be made to the corresponding descriptions in fig. 1 and fig. 2, which are not repeated here.
In an exemplary embodiment, referring to fig. 4, there is further provided a device 400 including a communication bus, a processor, a memory and a communication interface, and optionally an input-output interface and a display device, wherein the functional units communicate with each other via the bus. The memory stores a computer program, and the processor is configured to execute the program stored in the memory and perform the cache data cleaning method in the above embodiment.
There is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the cache data cleaning method.
From the above description of the embodiments, it will be clear to those skilled in the art that the present application may be implemented in hardware, or may be implemented by means of software plus necessary general hardware platforms. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective implementation scenario of the present application.
Those skilled in the art will appreciate that the drawing is merely a schematic illustration of a preferred implementation scenario and that the modules or flows in the drawing are not necessarily required to practice the application.
Those skilled in the art will appreciate that modules in an apparatus in an implementation scenario may be distributed in an apparatus in an implementation scenario according to an implementation scenario description, or that corresponding changes may be located in one or more apparatuses different from the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above sequence numbers of the embodiments are merely for description and do not represent the merits of the implementation scenarios.
The foregoing disclosure is merely illustrative of some embodiments of the application, and the application is not limited thereto, as modifications may be made by those skilled in the art without departing from the scope of the application.

Claims (8)

1. A cache data cleaning method, characterized by comprising the following steps:
respectively counting a plurality of historical cache increments of a circular queue in a plurality of historical time periods, and calculating an average increment of the plurality of historical cache increments, wherein the circular queue comprises a plurality of cache nodes for storing cache data;
for each prediction coefficient of at least one prediction coefficient, calculating a first product of the prediction coefficient and a first historical cache increment of the plurality of historical cache increments, and calculating a second product of the prediction coefficient and the average increment; calculating a first sum of the first product and the average increment, and taking the difference between the first sum and the second product as a first process value;
updating the first historical cache increment in the calculation process to a second historical cache increment, replacing the average increment with the first process value, and repeatedly executing the calculation process until the plurality of historical cache increments are traversed, to obtain the unit increment of the prediction coefficient, wherein the second historical cache increment is the next historical cache increment after the first historical cache increment among the plurality of historical cache increments;
repeatedly executing the process of calculating the unit increment to obtain at least one unit increment of the at least one prediction coefficient;
calculating the product of each unit increment and its corresponding coefficient weight to obtain at least one increment product;
calculating a product sum of the at least one increment product, calculating a weight sum of the at least one coefficient weight, and taking a first ratio of the product sum to the weight sum as a predicted cache increment;
counting the total data amount of the cache data currently stored in the circular queue, and calculating the sum of the total data amount and the predicted cache increment;
when the sum is greater than or equal to a cleaning threshold, calculating an importance coefficient of each cache node based on the node position and the node access rate of each cache node in the circular queue;
and clearing cache data of a target cache node in the circular queue, wherein the importance coefficient of the target cache node is lower than that of other cache nodes in the circular queue.
2. The method of claim 1, wherein said calculating the importance coefficients of each cache node comprises:
for each cache node in the circular queue, determining a node position of the cache node;
inquiring the data importance of the cache data stored by the cache node, and counting the node access rate of the cache node;
and determining the cleaning number, calculating a second ratio of the node position to the cleaning number, and taking the sum of the second ratio, the data importance and the node access rate as an importance coefficient of the cache node.
3. The method of claim 2, wherein querying the data importance of the cache data stored by the cache node comprises:
reading the data content of the cache data stored by the cache node, and determining the data type of the cache data stored by the cache node;
and inquiring the data importance corresponding to the data type as the data importance of the cache data stored by the cache node.
4. The method according to claim 1, wherein the method further comprises:
when data to be cached is received, storing the data to be cached in an idle cache node of the circular queue, wherein the idle cache node does not store the cached data;
and moving the idle cache node to the head of the circular queue.
5. The method according to claim 4, wherein the method further comprises:
if the data to be cached carries an expiration time, marking the data to be cached with the process time, and recording the storage duration of the data to be cached in the circular queue;
and when the storage duration reaches the expiration time, cleaning the data to be cached.
6. A cache data cleaning device, comprising:
a prediction module for: respectively counting a plurality of historical cache increments of a circular queue in a plurality of historical time periods, and calculating an average increment of the plurality of historical cache increments, wherein the circular queue comprises a plurality of cache nodes for storing cache data;
for each prediction coefficient of at least one prediction coefficient, calculating a first product of the prediction coefficient and a first historical cache increment of the plurality of historical cache increments, and calculating a second product of the prediction coefficient and the average increment; calculating a first sum of the first product and the average increment, and taking the difference between the first sum and the second product as a first process value;
updating the first historical cache increment in the calculation process to a second historical cache increment, replacing the average increment with the first process value, and repeatedly executing the calculation process until the plurality of historical cache increments are traversed, to obtain the unit increment of the prediction coefficient, wherein the second historical cache increment is the next historical cache increment after the first historical cache increment among the plurality of historical cache increments;
repeatedly executing the process of calculating the unit increment to obtain at least one unit increment of the at least one prediction coefficient;
calculating the product of each unit increment and its corresponding coefficient weight to obtain at least one increment product;
calculating a product sum of the at least one increment product, calculating a weight sum of the at least one coefficient weight, and taking a first ratio of the product sum to the weight sum as a predicted cache increment;
the statistics module is used for counting the total data quantity of the cache data currently stored in the circular queue and calculating the sum of the total data quantity and the predicted cache increment;
the calculation module is used for calculating the importance coefficient of each cache node based on the node position and the node access rate of each cache node in the circular queue when the sum is greater than or equal to a cleaning threshold value;
and the cleaning module is used for cleaning the cache data of the target cache node in the circular queue, and the importance coefficient of the target cache node is lower than that of other cache nodes in the circular queue.
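The recursion in the prediction module simplifies to exponential smoothing: each step computes a*y + avg - a*avg = a*y + (1 - a)*avg and feeds the result back in place of the average. The Python sketch below implements that reading, together with the weighted combination and the statistics-module threshold check; the function names and example values are assumptions, not claim language.

```python
# Sketch of the prediction and statistics modules of claim 6. The step
# a*y + avg - a*avg is written here in its simplified exponential-smoothing
# form a*y + (1 - a)*s; names and example values are assumptions.

def unit_increment(history: list[float], a: float) -> float:
    """Smooth the historical cache increments with one prediction coefficient a."""
    s = sum(history) / len(history)  # the average increment seeds the recursion
    for y in history:                # traverse the historical increments in order
        s = a * y + (1 - a) * s      # first sum minus second product
    return s

def predicted_cache_increment(history: list[float],
                              coefficients: list[float],
                              weights: list[float]) -> float:
    """First ratio of the product sum to the weight sum."""
    products = [w * unit_increment(history, a)
                for a, w in zip(coefficients, weights)]
    return sum(products) / sum(weights)

def needs_cleaning(current_total: float, predicted: float, threshold: float) -> bool:
    """Statistics-module check: clean when total plus prediction meets the threshold."""
    return current_total + predicted >= threshold

# Example with assumed values: per-period increments and three smoothing coefficients.
history = [120.0, 90.0, 150.0, 110.0]
predicted = predicted_cache_increment(history, [0.3, 0.5, 0.7], [1.0, 1.0, 1.0])
print(needs_cleaning(current_total=900.0, predicted=predicted, threshold=1000.0))
```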
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 5.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN201910780198.2A 2019-08-22 2019-08-22 Cache data cleaning method, device, equipment and computer readable storage medium Active CN110674121B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910780198.2A CN110674121B (en) 2019-08-22 2019-08-22 Cache data cleaning method, device, equipment and computer readable storage medium
PCT/CN2019/118233 WO2021031408A1 (en) 2019-08-22 2019-11-13 Cached data clearing method, device, equipment, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910780198.2A CN110674121B (en) 2019-08-22 2019-08-22 Cache data cleaning method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110674121A CN110674121A (en) 2020-01-10
CN110674121B true CN110674121B (en) 2023-08-22

Family

ID=69075526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910780198.2A Active CN110674121B (en) 2019-08-22 2019-08-22 Cache data cleaning method, device, equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN110674121B (en)
WO (1) WO2021031408A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674121B (en) * 2019-08-22 2023-08-22 平安科技(深圳)有限公司 Cache data cleaning method, device, equipment and computer readable storage medium
CN111338581B (en) * 2020-03-27 2020-11-17 上海天天基金销售有限公司 Data storage method and device based on cloud computing, cloud server and system
CN111858508B (en) * 2020-06-17 2023-01-31 远光软件股份有限公司 Regulation and control method and device of log system, storage medium and electronic equipment
CN112579652B (en) * 2020-12-28 2024-04-09 咪咕文化科技有限公司 Method and device for deleting cache data, electronic equipment and storage medium
CN112632347B (en) * 2021-01-14 2024-01-23 加和(北京)信息科技有限公司 Data screening control method and device and nonvolatile storage medium
CN112783886B (en) * 2021-03-12 2023-08-29 中国平安财产保险股份有限公司 Cache cleaning method, device, computer equipment and storage medium
US11922026B2 (en) 2022-02-16 2024-03-05 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network
CN116977146B (en) * 2023-08-25 2024-02-09 山东省环科院环境工程有限公司 Instrument data management and control system for environmental protection monitoring based on Internet of things

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060074990A1 (en) * 2004-09-28 2006-04-06 International Business Machines Corporation Leaf avoidance during garbage collection in a Java Virtual Machine
CN101630291B (en) * 2009-08-03 2012-11-14 中国科学院计算技术研究所 Virtual memory system and method thereof
US9116958B2 (en) * 2012-12-07 2015-08-25 At&T Intellectual Property I, L.P. Methods and apparatus to sample data connections
CN107346289A (en) * 2016-05-05 2017-11-14 北京自动化控制设备研究所 A kind of method with round-robin queue's buffered data
CN106227598A (en) * 2016-07-20 2016-12-14 浪潮电子信息产业股份有限公司 A kind of recovery method of cache resources
US10223270B1 (en) * 2017-09-06 2019-03-05 Western Digital Technologies, Inc. Predicting future access requests by inverting historic access requests in an object storage system
CN109491619A (en) * 2018-11-21 2019-03-19 浙江中智达科技有限公司 Caching data processing method, device and system
CN110674121B (en) * 2019-08-22 2023-08-22 平安科技(深圳)有限公司 Cache data cleaning method, device, equipment and computer readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226881A (en) * 2013-03-28 2013-07-31 马钢控制技术有限责任公司 Solving method for insufficient storage capacity of blacklist of POS (Point-of-sale) machine
CN105095107A (en) * 2014-05-04 2015-11-25 腾讯科技(深圳)有限公司 Buffer memory data cleaning method and apparatus
CN105045723A (en) * 2015-06-26 2015-11-11 深圳市腾讯计算机系统有限公司 Processing method, apparatus and system for cached data
CN109542802A (en) * 2018-11-26 2019-03-29 努比亚技术有限公司 Data cached method for cleaning, device, mobile terminal and storage medium

Also Published As

Publication number Publication date
CN110674121A (en) 2020-01-10
WO2021031408A1 (en) 2021-02-25

Similar Documents

Publication Publication Date Title
CN110674121B (en) Cache data cleaning method, device, equipment and computer readable storage medium
JP5838229B2 (en) Send product information based on determined preference values
CN104699422B (en) Data cached determination method and device
CN106649349A (en) Method, device and system for data caching, applicable to game application
CN104915319A (en) System and method of caching information
CN109284240A (en) Memory integrated circuit and its forecasting method
CN111858403B (en) Cache data heat management method and system based on probability to access frequency counting
CN108319598A (en) data cache method, device and system
CN108389070A (en) A kind of customer action characteristic analysis method, server and system
EP3080955A1 (en) Method and apparatus of determining time for sending information
CN112561142B (en) Queuing information inquiry system
CN111737632A (en) Queuing time determining method, device, server and computer readable storage medium
CN105960790A (en) Method for caching
CN109597800A (en) A kind of log distribution method and device
CN109542612A (en) A kind of hot spot keyword acquisition methods, device and server
CN102354385A (en) Mobile terminal, server and security information pushing method
US20110107268A1 (en) Managing large user selections in an application
CN109977074B (en) HDFS-based LOB data processing method and device
JP2018511131A (en) Hierarchical cost-based caching for online media
CN109769135B (en) Online video cache management method and system based on joint request rate
CN110830809A (en) Video content heat determination method, electronic device and storage medium
CN109446111A (en) Memory integrated circuit and its prefetch address decision method
CN114025017B (en) Network edge caching method, device and equipment based on deep circulation reinforcement learning
CN109150819A (en) A kind of attack recognition method and its identifying system
CN112446490A (en) Network training data set caching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40020235
Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant