WO2022156452A1 - Cache management method and apparatus, and device - Google Patents

Cache management method and apparatus, and device

Info

Publication number
WO2022156452A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
write
cache
request
written
Prior art date
Application number
PCT/CN2021/139427
Other languages
French (fr)
Chinese (zh)
Inventor
Xu Huang
Shan Weihua
Original Assignee
Huawei Cloud Computing Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co., Ltd.
Publication of WO2022156452A1 publication Critical patent/WO2022156452A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24552 - Database cache management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present application relates to the field of caching, and in particular, to a cache management method, apparatus, and device.
  • Application-level caching is a common storage component used to accelerate data access in systems such as databases, content distribution networks, and data storage.
  • the main function of the cache is to temporarily save data that will be repeatedly accessed by the system in the near future, thereby reducing the average latency of data access.
  • Cache media has fast access (read and write) speed but small storage capacity.
  • the core strength of a cache is a high hit rate, where the hit rate indicates the proportion of accesses to the cache for which the requested data is returned.
  • the present application provides a cache management method, which can improve the cache hit rate.
  • a first aspect of the present application provides a cache management method.
  • the cache management method is used in a cache management device, and the method includes: receiving a first data write request, where the first data write request is used to request that first data stored in a hard disk be written into the cache; training a cache write prediction model according to relevant parameters of the first data; writing the first data into the cache; receiving a second data write request, where the second data write request is used to request that second data in the hard disk be written into the cache; and determining, according to the cache write prediction model, whether to write the second data into the cache.
  • by training the cache write prediction model online, the cache management method improves the accuracy of predicting the write probability of data to be written and thereby improves the cache hit rate.
  • the method further includes: receiving a first batch of data write requests; and determining the first data write request and the second data write request from the first batch of data write requests according to a sampling rule.
  • the method further includes: obtaining the identifier of the data to be written carried by each data write request in the first batch of data write requests; determining that a data write request whose carried identifier has a hash value divisible by the sampling value is the first data write request; and determining that a data write request whose carried identifier has a hash value not divisible by the sampling value is the second data write request.
  • the method further includes: receiving a second batch of data write requests; and determining, according to the trained cache write prediction model, whether to write at least one piece of the data to be written corresponding to the second batch of data write requests into the cache.
  • the method further includes: using at least one of the average size of data in the cache, the historical total number of write requests per cycle, the historical total number of read requests per cycle, and the average elimination period of data in the cache, together with the relevant parameters of the first data, as the training input of the cache write prediction model; determining the write probability of the first data, according to the request situation and the elimination situation of the first data within one average elimination period after the first data request occurs, as the training output of the cache write prediction model; and training the cache write prediction model according to the training input and the training output.
  • the relevant parameters of the first data include at least one of the following: the number of historically requested writes of the first data, and the number of historically requested reads of the first data.
  • the method further includes: obtaining a write threshold; obtaining a write probability of the second data according to the cache write prediction model; and determining, according to the write threshold and the write probability of the second data, whether to write the second data into the cache.
  • the method further includes: using at least one of the average size of data in the cache, the average elimination period of data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, together with the parameters related to the second data, as the prediction input of the cache write prediction model; and feeding the prediction input into the trained cache write prediction model to obtain the write probability of the second data.
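The design above reduces the write decision to comparing a predicted write probability against a write threshold. A minimal sketch, assuming a simple greater-or-equal convention (the patent does not specify how ties at the threshold are treated, and the function name is illustrative):

```python
def should_write_to_cache(write_probability: float, write_threshold: float) -> bool:
    """Admit the data into the cache only when the predicted write
    probability reaches the write threshold."""
    return write_probability >= write_threshold
```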
  • a second aspect of the present application provides a cache management apparatus, the apparatus including a communication unit and a processing unit: the communication unit is configured to receive a first data write request, where the first data write request is used to request that first data stored in a hard disk be written into the cache; the processing unit is configured to train a cache write prediction model according to the relevant parameters of the first data, and to write the first data into the cache; the communication unit is further configured to receive a second data write request, where the second data write request is used to request that second data in the hard disk be written into the cache; and the processing unit is further configured to determine, according to the cache write prediction model, whether to write the second data into the cache.
  • the communication unit is configured to receive the first batch of data write requests; the processing unit is configured to determine the first data write request and the second data write request from the first batch of data write requests according to a sampling rule.
  • the processing unit is configured to obtain the identifier of the data to be written carried by each data write request in the first batch of data write requests; determine that a data write request whose carried identifier has a hash value divisible by the sampling value is the first data write request; and determine that a data write request whose carried identifier has a hash value not divisible by the sampling value is the second data write request.
  • the communication unit is configured to receive a second batch of data write requests; the processing unit is configured to determine, according to the trained cache write prediction model, whether to write at least one piece of the data to be written corresponding to the second batch of data write requests into the cache.
  • the processing unit is configured to use at least one of the average size of data in the cache, the historical total number of write requests per cycle, the historical total number of read requests per cycle, and the average elimination period of data in the cache, together with the relevant parameters of the first data, as the training input of the cache write prediction model; determine the write probability of the first data, according to the request situation and the elimination situation of the first data within one average elimination period after the first data request occurs, as the training output of the cache write prediction model; and train the cache write prediction model according to the training input and the training output.
  • the relevant parameters of the first data include at least one of the following: the number of historically requested writes of the first data, and the number of historically requested reads of the first data.
  • the communication unit is configured to obtain a write threshold; the processing unit is configured to obtain a write probability of the second data according to the cache write prediction model, and to determine, according to the write threshold and the write probability of the second data, whether to write the second data into the cache.
  • the processing unit is configured to use at least one of the average size of data in the cache, the average elimination period of data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, together with the parameters related to the second data, as the prediction input of the cache write prediction model, and to feed the prediction input into the trained cache write prediction model to obtain the write probability of the second data.
  • a third aspect of the present application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, to cause the computing device to perform the method as provided by the first aspect or any possible design of the first aspect.
  • a fourth aspect of the present application provides a computer program product comprising instructions which, when executed by a cluster of computer devices, cause the cluster of computer devices to perform a method as provided by the first aspect or any possible design of the first aspect.
  • a fifth aspect of the present application provides a computer-readable storage medium comprising computer program instructions that, when executed by a cluster of computing devices, cause the cluster to perform the method provided by the first aspect or any possible design of the first aspect.
  • FIG. 1 is a schematic diagram of a possible application scenario applicable to the embodiment of the present application.
  • FIG. 2 is a flowchart of a possible cache management method applicable to the embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a possible cache management apparatus applicable to the embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a possible computing device suitable for an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a possible computing device cluster applicable to the embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a possible computing device cluster applicable to the embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a possible computing device cluster applicable to the embodiment of the present application.
  • first and second in the embodiments of the present application are used only for description and should not be understood as indicating or implying relative importance or the number of indicated technical features. Thus, a feature defined as "first" or "second" may expressly or implicitly include one or more of that feature.
  • Cache: an intermediate layer, built on a fast storage medium, between the service system and slower storage, used to temporarily store recently accessed data and reduce data access latency.
  • Cache write: the process of deciding whether to write data into the cache when a data write request arrives at the cache system.
  • Cache hit rate: when a user accesses the acceleration node, it is called a hit if the node has cached the data to be accessed; if not, the request needs to go back to the origin server to fetch the data, that is, a miss. The process of fetching data is synchronized with user access, so even if new data is fetched again, the user does not perceive additional delay.
  • the hit rate equals the number of hits divided by the sum of hits and misses.
  • the cache hit rate is one of the important factors for judging the acceleration effect.
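The hit-rate definition above can be stated as a one-line computation. This sketch is illustrative only; the function name is not from the patent:

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of cache accesses served from the cache: hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0

print(hit_rate(90, 10))  # 0.9
```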
  • the access speed of the cache medium is usually much faster than that of the system's main storage; in a typical scenario, the access speeds differ by several orders of magnitude.
  • memory is usually used as the cache medium for the hard disk and the remote network. Relative to the data scale of the application system, the available storage capacity of the cache medium is extremely small. For example, a typical Internet service database stores petabytes (PB) of data, but the cache capacity is only a few gigabytes (GB), or even less than 1 GB.
  • LRU: least recently used
  • FIFO: first-in, first-out
  • Cache write refers to judging, before data enters the cache medium, whether the data is worth admitting into the cache system; if it is not, the data is not placed in the cache medium.
  • some systems do not allow data that has recently been accessed only once to enter the cache; data accessed two or more times is written to the cache.
  • rule-based or frequency-based write techniques may also be used.
  • the embodiments of the present application provide a technical solution capable of adaptively adjusting the write strategy, that is, an online, self-adaptive cache write method.
  • the data-driven algorithm automatically predicts the future hit probability of data while the system is running, and learns the write model online to adapt quickly to changes in business data and achieve accurate and efficient write decisions.
  • the application 102 will be triggered to initiate a data read request to the cache 103 .
  • the application 102 may be a web application, or a third-party application software on a smart terminal or the like.
  • the data read request initiated by the application 102 to the cache 103 will be recorded by the cache management apparatus 104. Specifically, information such as the identification information and request time of the requested data will be recorded. At the same time, the historical request count of the data can also be updated.
  • the cache 103 When the data exists in the cache 103 , the cache 103 will return the data to the application 102 . At the same time, the cache management device 104 will record the data request result. That is, the data read request is responded to.
  • the application 102 When the data does not exist in the cache 103 , the application 102 will initiate a data read request to the hard disk 105 . At the same time, the cache management device 104 will record the data request result. That is, the data read request is not responded to.
  • the hard disk 105 may include a solid-state hard disk, a conventional hard disk, and a hybrid hard disk.
  • the hard disk 105 When the data exists in the hard disk 105 , the hard disk 105 will return the data to the application 102 . At the same time, the cache management device 104 will decide whether to write the data into the cache 103 or not.
  • the request for writing data into the cache 103 is triggered by the user 101 clicking on the application 102 .
  • the system will return the information that the data cannot be found to the application 102.
  • the request for writing data into the cache 103 may be triggered by the application 102 .
  • it can be triggered by the refresh and warm-up of the content delivery network (CDN).
  • CDN: content delivery network
  • the application 102 will actively obtain updated data from the origin site. Therefore, when the user 101 accesses the CDN, there is no need to go back to the origin site of the tenant to obtain data. Therefore, when the refresh or warm-up action is triggered, the cache management device 104 will determine whether to write this part of the data into the cache 103 .
  • URL: uniform resource locator
  • the following describes a cache writing method 200 provided by the present application.
  • the cache writing method may run on the cache management device 104 .
  • the flowchart of the cache writing method 200 is shown in FIG. 2 .
  • the cache writing method includes four parts: request information recording, prediction model training, write judgment, and cache data elimination.
  • the request information that the cache management device 104 may receive includes a data write request and a data read request.
  • the request information recording part includes steps S201 to S204.
  • the cache management apparatus 104 receives a data read request.
  • the application 102 will initiate a data read request to the cache management apparatus 104.
  • the cache management device 104 receives the data read request.
  • the data read request indicates that the data in the cache 103 is returned to the application 102 .
  • the cache management apparatus 104 records the request information of the data to be read.
  • the cache management device 104 When a data read request comes, the cache management device 104 will record the information of the data to be read and the request time.
  • the information of the data may be an identifier (ID).
  • the ID may also be information such as a universally unique identifier (UUID) of the data.
  • UUID: universally unique identifier
  • based on the recorded information, at least one of the following can be calculated for each piece of data: the number of requested reads per cycle and the historical average number of requested reads per cycle.
  • the length of the cycle can be set as required. For example, if the cycle is 1 second, the number of requested reads per cycle of a piece of data indicates the number of times that data was requested to be read in 1 second, and the historical average number of requested reads per cycle indicates the average number of requested reads per cycle over the historical count period. The historical count period can be set as required.
  • an 8-byte data group may be maintained for each data.
  • the 8 bytes are equally divided into 32 units.
  • each unit is 2 bits, and the units are numbered 1, 2, ..., 32.
  • each unit can represent four states (00, 01, 10, 11) of the number of read requests per cycle of the data, corresponding to no read request, one read request, two read requests, and three or more read requests, respectively.
  • by maintaining such a data group, the number of read requests per cycle of the data over the past 32 cycles can be recorded. With this storage method, a long history of per-cycle requested reads can be stored in a small storage space.
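The 8-byte, 32-unit scheme above can be sketched as a packed array of 2-bit saturating counters. This is an illustrative reconstruction under the stated layout; the class and method names are assumptions, not from the patent:

```python
class PerCycleCounters:
    """32 per-cycle request counters packed into one 64-bit value,
    2 bits each; states 0, 1, 2, 3 mean "no request", "one", "two",
    and "three or more" requests in that cycle."""

    def __init__(self) -> None:
        self.bits = 0  # packed counters; the low 2 bits are the current cycle

    def record_request(self) -> None:
        """Saturating increment of the current cycle's 2-bit counter."""
        cur = self.bits & 0b11
        if cur < 3:
            self.bits = (self.bits & ~0b11) | (cur + 1)

    def new_cycle(self) -> None:
        """Advance one cycle: the oldest counter falls off the top,
        and the current cycle restarts at zero."""
        self.bits = (self.bits << 2) & ((1 << 64) - 1)

    def count(self, cycles_ago: int) -> int:
        """Counter value for a past cycle (0 = current), capped at 3."""
        return (self.bits >> (2 * cycles_ago)) & 0b11
```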
  • the total number of read requests in each cycle can also be obtained. That is, the total number of requested reads per cycle.
  • the length of the historical time can be set as required.
  • S201 and S202 are not necessary steps. As mentioned above, when the data write request is not triggered by the user 101, but is triggered by a refresh or warm-up action of the application 102, S201 and S202 are not necessary.
  • the cache management apparatus 104 receives the data writing request.
  • the cache management device 104 will receive a data write request.
  • the data write request indicates that the data requested to be written in the hard disk is written into the cache.
  • the application 102 will be triggered to initiate a data write request to the cache 103 .
  • the data write request initiated by the application 102 to the cache 103 will be recorded by the cache management apparatus 104 .
  • the cache 103 When the data requested to be written exists in the cache 103 , the cache 103 will return the data to the application 102 . At the same time, the cache management device 104 will record the data request result. That is, the data read request is responded to.
  • the application 102 When the data requested to be written does not exist in the cache 103 , the application 102 will initiate a data read request to the hard disk 105 . At the same time, the cache management device 104 will record the data request result. That is, the data read request is not responded to.
  • the hard disk 105 When the data exists in the hard disk 105 , the hard disk 105 will return the requested data to the application 102 . At the same time, the cache management device 104 will decide whether to write the requested data into the cache 103 or not. That is, the cache management device 104 receives the write request of the data requested to be written.
  • the request for writing data into the cache 103 is triggered by the user 101 clicking on the application 102 . That is, it is triggered by steps S101 and S102.
  • the request for writing data into the cache 103 may be triggered by the application 102 .
  • it can be triggered by the refresh and warm-up of the content delivery network (CDN). Therefore, the cache management device 104 will receive a write request for the data.
  • CDN: content delivery network
  • the cache management apparatus 104 will receive a batch of data write requests in the process of accumulating a certain amount of training data.
  • a certain number can be the sampling threshold, and can also be set as required.
  • the first batch of data write requests indicates a set of a batch of data write requests received by the cache management apparatus 104 within a certain period of time.
  • the first data writing request and the second data writing request in the first batch of data writing requests may be determined according to the sampling rule.
  • the first data writing request or the second data writing request may include one or more data writing requests.
  • the first data mentioned below indicates the data requested to be written in the first data write request.
  • the second data indicates the data requested to be written in the second data write request.
  • the cache management apparatus 104 records the request information of the data to be written.
  • when the first batch of data write requests comes, the ID and request time of each piece of data in the first batch are recorded. Further, at least one of the following parameters can be obtained: the number of requested writes per cycle of each piece of data, the historical average number of requested writes per cycle, and the total number of requested writes per cycle.
  • the length of the cycle can be set as required. For example, if the cycle is 1 second, the number of requested writes per cycle of a piece of data indicates the number of times that data is requested to be written in 1 second, and the historical average number of requested writes per cycle indicates the average number of requested writes per cycle over the historical count period.
  • the total number of write requests per cycle can be obtained.
  • this indicator denotes the total number of write requests received by the cache management apparatus 104 within one second.
  • optionally, the number of requested writes per cycle of each piece of data may be retained.
  • since the data volume is large, retaining the per-cycle requested writes of every piece of data over a long period would occupy substantial storage space. Therefore, to reduce storage, methods such as optimizing data storage and shortening the retention period can be adopted.
  • an 8-byte data group can be maintained for each piece of data. Further, the 8 bytes are equally divided into 32 units. Each unit is 2 bits, and the units are numbered 1, 2, ..., 32. Each unit can represent four states (00, 01, 10, 11) of the number of writes requested per cycle of the data, corresponding to no write request, one write request, two write requests, and three or more write requests, respectively. In other words, by maintaining an 8-byte data group for each piece of data, the number of write requests per cycle over the past 32 cycles can be recorded. With this storage method, a long history of per-cycle requested writes can be stored in a small storage space.
  • the total number of write requests in each cycle can be obtained. That is, the total number of requested writes per cycle.
  • the execution of steps S203 and S204 has no fixed order relative to the execution of steps S201 and S202.
  • steps S203 and S204 may be performed before or after steps S201 and S202.
  • steps S203 and S204 may also be performed simultaneously with steps S201 and S202.
  • the cache write prediction model After receiving the data read request and the data write request in steps S201 to S204, and recording the related information, the cache write prediction model can be trained by using part of the data and the related information. Specifically, the prediction model training part includes steps S205 to S207.
  • the cache management apparatus 104 determines whether the data to be written satisfies the sampling rule.
  • the identifiers of the data to be written obtained in step S204 it can be determined whether the data to be written satisfies the sampling rule.
  • the first data write request and the second data write request in the first batch of data write requests may be determined according to the sampling rule. That is, the first data to be written and the second data to be written in the first batch of data to be written can be determined.
  • the first data to be written indicates the data to be written that satisfies the sampling rule
  • the second data to be written indicates the data to be written that does not meet the sampling rule.
  • the data volume is large, and it is impractical to use every piece of data as a training sample for the prediction model. Therefore, a portion of the data needs to be sampled as training samples.
  • there are at least two factors that affect sampling: the sampling rate and the sampling rule.
  • the sampling rate indicates the average probability that each piece of data is sampled, and its value is usually between 0 and 1. The lower the sampling rate, the more slowly samples accumulate.
  • Sampling rules can be determined based on the identity of the data in the system.
  • the sampling rule can be determined according to identifiers such as ID, URL, or UUID.
  • IDs can be strings or numbers. Therefore, after the operation result is obtained by using the ID of the data, whether to sample the data can be determined according to the operation result and the sampling value.
  • the operation method may be a method such as a neural network or a hash algorithm. The embodiment of the present application does not limit the calculation method.
  • the sampling value can be set as required. For example, the sample value can be made equal to the sample rate.
  • the hash operation result is taken modulo the sampling value.
  • when the remainder is 0, the data to be written is regarded as the first data to be written, that is, training data.
  • when the remainder of the hash operation result modulo the sampling value is not 0, the data is not sampled; that is, the flow goes to step S209.
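The sampling decision above (hash the identifier, take the remainder modulo the sampling value, sample when it is 0) can be sketched as follows. The choice of `zlib.crc32` as the hash is an assumption; the text leaves the operation method open:

```python
import zlib

def is_training_sample(data_id: str, sampling_value: int = 10) -> bool:
    """Sample a data item as training data when the hash of its identifier
    is divisible by the sampling value. With sampling_value = 10, roughly
    1 in 10 identifiers is sampled (sampling rate ~ 1 / sampling_value)."""
    # crc32 is one stable, deterministic hash choice; any hash would do.
    return zlib.crc32(data_id.encode()) % sampling_value == 0
```

Because the decision depends only on the identifier, repeated requests for the same data item are consistently sampled or consistently skipped.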
  • the cache management apparatus 104 marks the first data to be written that satisfies the sampling rule.
  • the cache management apparatus 104 may mark the first data to be written.
  • the first data to be written may be marked according to its request situation within a period of time after the judgment in S205. That is, the label of the first data to be written can be determined according to the number of times it is requested to be written and requested to be read within one average cache elimination period.
  • the length of the period of time can be set as required. The following will take the average cache elimination period as an example for introduction, and how to obtain the average cache elimination period will be introduced in detail in step S213.
  • when the first data to be written satisfies at least one of the following conditions within one average elimination period: being requested to be written at least once, or being requested to be read at least once, the first data to be written is marked as hot data.
  • when the first data to be written satisfies both of the following conditions within one average cache elimination period: not being requested to be written and not being requested to be read, the first data to be written is marked as cold data.
  • the training data may also be marked according to the elimination situation of the first data to be written in an average elimination cycle in the future.
  • the first data to be written is also directly written into the cache while the first data to be written is marked. Since the storage space of the cache is limited, the stored data needs to be eliminated according to the elimination rule.
  • the specific elimination rules will be introduced in detail in S212.
  • if, within one average cache elimination period and before being evicted from the cache, the first data to be written is neither requested to be written nor requested to be read, then after the data is evicted from the cache, it is marked as cold data.
  • the marked state of the data is not modified. After completing the labeling of the data to be written, it may go to step S207 for training the prediction model.
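The hot/cold marking rule above can be sketched as a simple predicate over the counts observed during one average elimination period; the parameter names are illustrative, not from the patent:

```python
def label(writes_in_period: int, reads_in_period: int) -> str:
    """Label training data: hot if requested (written or read) at least
    once within one average cache elimination period, otherwise cold."""
    if writes_in_period >= 1 or reads_in_period >= 1:
        return "hot"
    return "cold"
```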
  • the cache management device 104 trains the prediction model according to the record information and the marked first data to be written.
  • according to the marking information of each first data to be written obtained in S206 and at least one of the following parameters: the number of requested writes per cycle, the total number of requested writes per cycle, the average size of cached data, the average cache elimination period, the number of requested reads per cycle, and the total number of requested reads per cycle, the prediction model can be trained.
  • the average total requested writes per cycle can be obtained.
  • Specifically, a data group several bytes in size is maintained, recording the total number of requested writes per cycle over a certain period of time. The average total number of requested writes per cycle can then be obtained by calculating the average value of this data group.
  • the average total number of read requests per cycle can be obtained. Specifically, as mentioned above, a data group of several bytes in size is maintained in S204, and the total number of read requests per cycle within a certain period of time is recorded. Further, the average total number of read requests per cycle can be obtained by calculating the average value of the data group.
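The "data group of several bytes" used above to track per-cycle totals can be sketched as follows, under the assumption (not stated in the patent) that it behaves as a small fixed-size window of recent per-cycle totals:

```python
from collections import deque


class CycleCounter:
    """Keep the last N per-cycle totals (a small fixed-size 'data group')
    and expose their average, as used for the per-cycle write/read totals."""

    def __init__(self, window: int = 8):
        self.totals = deque(maxlen=window)  # totals for recent cycles only

    def record_cycle_total(self, total: int) -> None:
        self.totals.append(total)           # oldest entry drops automatically

    def average(self) -> float:
        return sum(self.totals) / len(self.totals) if self.totals else 0.0


writes = CycleCounter(window=4)
for t in (10, 20, 30, 40, 50):  # 5 cycles recorded; only the last 4 kept
    writes.record_cycle_total(t)
assert writes.average() == 35.0  # (20 + 30 + 40 + 50) / 4
```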
  • the average cache data size indicates the average size of the data stored in the cache medium.
  • the specific obtaining method will be described in detail in step S211.
  • the average cache eviction period indicates the average value of the time period from the entry of the cache to the eviction of each data in the cache medium.
  • the specific obtaining method will be described in detail in step S213.
  • At least one of the following is used as input: the average size of cached data, the average cache elimination cycle, the average total number of requested writes per cycle, the number of requested writes per cycle of each data to be written, the average total number of requested reads per cycle, and the number of requested reads per cycle of each data to be written. The label information of each data to be written is used as output, and the prediction model can thus be trained.
  • the prediction model can be an artificial intelligence model such as a back propagation neural network model or a long short-term memory network. It should be noted that the embodiments of the present application do not limit the method for establishing the prediction model, and since it is the prior art, details are not described again.
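As a hedged sketch of this training step: the patent allows a back-propagation neural network or LSTM; a plain logistic-regression model trained by gradient descent is substituted here for brevity, and the feature layout and all names are illustrative assumptions. The sigmoid output naturally stays between 0 and 1, matching the 0/1 label preprocessing described below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature rows per labeled sample:
# [writes_per_cycle, reads_per_cycle, avg_cache_data_size,
#  avg_elimination_cycle, avg_total_writes_per_cycle, avg_total_reads_per_cycle]
X = rng.random((200, 6))
# Hot/cold labels preprocessed to 1/0; here a synthetic rule stands in
# for the real S206 marking.
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)

w = np.zeros(6)
b = 0.0
lr = 0.5
for _ in range(500):                         # gradient-descent training rounds
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid keeps output in (0, 1)
    grad_w = X.T @ (p - y) / len(y)
    grad_b = float(np.mean(p - y))
    w -= lr * grad_w
    b -= lr * grad_b

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = float(np.mean((p >= 0.5) == (y == 1.0)))
assert 0.0 <= float(p.min()) and float(p.max()) <= 1.0
assert accuracy > 0.8
```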
  • the training of the prediction model may be started.
  • the sampling threshold can be set as required. For example, it can be set to start a round of training for the prediction model when the number of labeled training data reaches 10,000.
  • A training round threshold for the prediction model can be set. Based on the number of rounds the prediction model has already been trained for and this round threshold, it can be determined whether further training of the model is required. When the number of completed training rounds reaches or exceeds the round threshold, the prediction model is no longer trained.
  • For example, when the round threshold is set to 100 and the prediction model has been trained for 100 rounds, operations such as sampling and labeling of the data to be written are no longer performed, and the prediction model is no longer trained.
  • Further, the first data to be written that was stored for training the prediction model may be deleted.
  • The output quantity, that is, the label information of each first data to be written, may be preprocessed. Specifically, the cold/hot marking states can be converted into 0/1 or 1/0. In this class of possible implementations, the output of the prediction model can be constrained to the range between 0 and 1.
  • After sampling the data to be written and training the prediction model in steps S205 to S207, a trained prediction model is obtained. For data that does not satisfy the sampling rule in S205, its write probability may be determined based on the prediction model.
  • the prediction model may be the model after training in step S207 or the model before training in step S207. Specifically, the model used for prediction needs to go through at least one round of training.
  • the writing judgment section includes steps S208 to S211.
  • the cache management apparatus 104 generates a prediction model and uses it to predict the writing probability of the data to be predicted.
  • a prediction model that has completed at least one round of training can be obtained to predict the writing state of the data to be predicted.
  • the to-be-predicted data indicates the second to-be-written data determined in step S209 that does not satisfy the special rule.
  • a specific method for obtaining the second data to be written that does not meet the special rules will be introduced in step S209.
  • the prediction model used to predict the writing probability of the second data to be written in this step may be different from the prediction model trained in step S207.
  • the cache write prediction model used in step S208 is the cache write prediction model before training in step S207. That is, the cache write prediction model trained by the data corresponding to the previous batch of data write requests.
  • the cache write prediction model trained by the first data to be written in the first batch of data can be used to predict part of the data in the second batch of data corresponding to the second batch of data write requests.
  • the occurrence time of the second batch of data write requests should be later than the occurrence time of the first batch of data write requests.
  • At least one of the following is used as input: the number of requested writes per cycle of the data to be predicted, the average size of cached data, the average cache elimination cycle, the average total number of requested writes per cycle, the average total number of requested reads per cycle, and the number of requested reads per cycle of the second data to be written. The writing state of the data to be predicted can then be predicted.
  • an output value between 0 and 1 can be obtained. That is, the writing probability of the data to be predicted.
  • the cache management apparatus 104 determines whether the second data to be written that does not satisfy the sampling rule satisfies the special rule.
  • the second data to be written that does not satisfy the sampling rule will be transferred to S209 for processing.
  • In S209, it will be further judged whether the data satisfies the special rules.
  • Special rules include the forced writing of certain data and the exclusion of certain data. For example, according to tenant requirements, files of special types are not written, or files of special domain names are not written.
  • For the second data to be written that does not satisfy the special rules, the method proceeds to step S210.
  • step S209 is an optional step in the cache writing method 200 .
  • In S210, the cache management device 104 determines the write state of the second to-be-written data that does not meet the special rules according to the write probability and the write threshold.
  • Specifically, according to the write probability and the write threshold, the write state of the data can be determined.
  • the write threshold can be set as required.
  • The write probability of the second to-be-written data that does not satisfy the sampling rule may be obtained by using the prediction model in S208. That is, the second to-be-written data that does not satisfy the sampling rule is used as the to-be-predicted data of the prediction model. Specifically, at least one of the following is used as input to the prediction model: the number of requested writes per cycle of the data, the average size of cached data, the average cache elimination cycle, the average total number of requested writes per cycle, the average total number of requested reads per cycle, and the number of requested reads per cycle of the data. The prediction model can then predict the writing state of the data.
  • an output value between 0 and 1 can be obtained after using the prediction model to predict the writing state of the data. That is, the writing probability of the data.
  • the write state of the second data to be written that does not satisfy the sampling rule can be determined.
  • When the write probability is greater than or equal to the write threshold, the data will be written to the cache. When the write probability is less than the write threshold, the data will not be written to the cache.
  • Optionally, the write threshold may be built into the prediction model. That is, the prediction model can directly output the judgment of whether to write the second data to be written into the cache.
  • the premise of using the prediction model to predict the writing probability is that at least one round of training has been performed on the prediction model in step S207.
  • If the accumulated number of data to be written in step S206 has not yet reached the sampling threshold, the prediction model has not been trained.
  • the writing state of the second to-be-written data that does not satisfy the sampling rule can be determined according to the prior art.
  • the prior art includes writing rules based on specific rules or frequency statistics, etc., which will not be repeated.
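A minimal sketch of the S209/S210 decision described above, assuming (the patent does not specify a representation) that the tenant's special rules can be modeled as simple sets of data IDs that are always written or always excluded:

```python
def should_write(data_id: str,
                 write_probability: float,
                 write_threshold: float = 0.5,
                 force_write=frozenset(),
                 force_exclude=frozenset()) -> bool:
    """Decide whether to admit data into the cache (steps S209/S210).

    Special rules take priority: tenant-configured IDs that are always
    excluded or always written are handled first; otherwise the predicted
    write probability is compared against the write threshold.
    """
    if data_id in force_exclude:
        return False
    if data_id in force_write:
        return True
    return write_probability >= write_threshold


assert should_write("a", 0.9) is True
assert should_write("a", 0.3) is False
assert should_write("a", 0.3, force_write={"a"}) is True
assert should_write("a", 0.9, force_exclude={"a"}) is False
```

The 0.5 default threshold here is purely illustrative; as stated above, the actual write threshold can be set as required.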
  • the cache management device 104 writes the determined to-be-written data into the cache, and counts the size of the written data.
  • In step S211, the data determined to be written to the cache includes at least the following three types of situations:
  • The data to be written may be the data to be written that satisfies the sampling rule in step S205, that is, the first data to be written, for which the remainder of the data ID modulo the sampling value is 0.
  • the data to be written may also be the second data to be written that satisfies the special rule in step S209.
  • The second data to be written indicates the data to be written for which the remainder of the data ID modulo the sampling value is not 0.
  • The data to be written may also be the second data to be written in step S210 that does not meet the special rules and whose write probability is not less than the write threshold.
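The sampling rule referenced in the cases above (remainder of the data ID, or its hash, modulo a sampling value) could be sketched as follows; the use of md5 as a stable hash is an illustrative assumption, chosen because Python's built-in `hash()` is salted per process:

```python
import hashlib


def is_sampled(data_id: str, sample_value: int = 100) -> bool:
    """Sampling rule: a write request is selected as 'first data to be
    written' (a training sample) when the hash of its data ID is divisible
    by the sampling value; otherwise it is 'second data to be written'."""
    digest = int(hashlib.md5(data_id.encode()).hexdigest(), 16)
    return digest % sample_value == 0


# Roughly 1 in sample_value IDs become training samples.
sampled = sum(is_sampled(f"obj-{i}", 10) for i in range(1000))
assert 50 < sampled < 200
```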
  • the size of each data to be written needs to be recorded, so as to calculate the average size of the data in the cache medium at the current moment. Further, the average size of the data can be used as an input in step S207 for training the prediction model.
  • the cache data elimination part includes steps S212 to S214.
  • In S212, the cache management device 104 determines whether to eliminate some data in the cache according to the elimination rule.
  • Common cache data elimination algorithms include least recently used (LRU) and first-in, first-out (FIFO) methods.
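A minimal sketch of the LRU policy mentioned above (the FIFO variant would simply omit the recency update on reads); the class and method names are illustrative, not from the patent:

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used elimination, one of the common policies
    mentioned above. Evicted items are returned so that downstream steps
    (updating labels, computing the average elimination period) can use them."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)          # mark as recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        evicted = []
        while len(self.store) > self.capacity:
            evicted.append(self.store.popitem(last=False))  # oldest out
        return evicted


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                   # "a" becomes most recently used
evicted = cache.put("c", 3)      # "b" is the least recently used victim
assert evicted == [("b", 2)]
assert cache.get("b") is None
```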
  • the cache management device 104 updates the labeling state of the eliminated data, and calculates the average cache elimination period.
  • the data to be written into the cache in step S211 includes at least three cases.
  • The data to be written may be the first data to be written that satisfies the sampling rule in step S205, that is, the first data to be written that is simultaneously awaiting marking in step S206.
  • the first data to be written may be marked according to the elimination of the first data to be written in the cache. Specifically, for the eliminated data, if the data belongs to the first data to be written and the marking has not been completed in step S206, the first data to be written is marked as cold data. Further, the first to-be-written data marked in step S206 is transferred to step S207 and used as training data for the prediction model.
  • the average cache elimination cycle can be obtained.
  • the writing time of the data is the requested writing time of the data recorded in step S202.
  • the elimination period of each eliminated data can be obtained by subtracting its write time from the elimination time of each eliminated data.
  • the average cache elimination period can be determined.
  • each item of eliminated data may be all data that has been eliminated in history.
  • the eliminated data may also be eliminated data within a period of time in the past.
  • the average cache elimination period can be used as an input in step S207 for training the prediction model.
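The average cache elimination period computation described above (elimination time minus write time, averaged over eliminated items) can be sketched as follows; the tracker class and its names are illustrative assumptions:

```python
import time


class EliminationTracker:
    """Track write times and compute the average cache elimination period
    (elimination time minus write time, averaged over eliminated items)."""

    def __init__(self):
        self.write_time = {}   # data id -> time it entered the cache
        self.periods = []      # elimination periods of eliminated items

    def on_write(self, data_id, now=None):
        self.write_time[data_id] = time.time() if now is None else now

    def on_eliminate(self, data_id, now=None):
        now = time.time() if now is None else now
        self.periods.append(now - self.write_time.pop(data_id))

    def average_period(self) -> float:
        return sum(self.periods) / len(self.periods) if self.periods else 0.0


t = EliminationTracker()
t.on_write("a", now=0.0)
t.on_write("b", now=5.0)
t.on_eliminate("a", now=10.0)   # period 10
t.on_eliminate("b", now=15.0)   # period 10
assert t.average_period() == 10.0
```

Restricting `periods` to a recent time window would give the "eliminated data within a period of time in the past" variant mentioned above.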
  • The data determined to be eliminated in step S212 is output.
  • steps S213 and S214 have no fixed execution order. That is, step S213 may be performed before or after step S214. Optionally, step S213 and step S214 may also be performed simultaneously.
  • the present application also provides a cache management device 104, as shown in FIG. 3, including:
  • the communication unit 302 is configured to receive a data read request in S201 and a data write request in S203.
  • the communication unit 302 is further configured to receive the set sampling threshold in S207.
  • the communication unit 302 is used to obtain the write threshold.
  • the communication unit 302 is further configured to receive the special rules set by the tenant in S209.
  • the storage unit 304 is configured to store the request information of the data to be read recorded in S202 and the request information of the data to be written recorded in S204.
  • the storage unit 304 is configured to store the relevant information of the first data to be written determined in S205.
  • the storage unit 304 is further configured to store the parameters in the prediction model trained in S207.
  • the sampling threshold received in S207 and the write threshold received in S210 will also be stored in the storage unit 304 .
  • the storage unit 304 is also used to store the special rules set by the tenant in S209.
  • The cache unit 306, in the cache management method 200, is configured to cache the first data to be written determined in S205.
  • the second to-be-written data determined to be written to the cache and the second to-be-written data whose write probability is greater than the write threshold in S210 will both be cached in the cache unit 306 .
  • the processing unit 308 is configured to perform the recording operations in S202 and S204, and store the request information of the data to be read and the data to be written into the storage unit 304.
  • the processing unit 308 is further configured to judge the current data to be written in S205, and determine the first data to be written and the second data to be written.
  • the processing unit 308 is configured to mark the first data to be written. Further, the operation of training the prediction model according to the recorded request information and annotation information in S207 is also performed by the processing unit 308 .
  • the processing unit 308 is configured to perform an operation of predicting the writing probability of the data to be predicted.
  • the processing unit 308 is further configured to determine in S209 whether the second data to be written satisfies the special rule.
  • the operation of determining whether to write the second data to be written into the cache according to the write probability and the write threshold obtained in S208 is also performed by the processing unit 308.
  • The operation of writing the determined to-be-written data into the cache and counting the size of the data in the cache is also performed by the processing unit 308.
  • the processing unit 308 is further configured to output the partial data in the cache according to the elimination rule in S212.
  • the labeling state of the eliminated data is updated according to the elimination situation of the data, and the operation of calculating the average elimination period is performed by the processing unit 308.
  • the processing unit 308 may include a recording unit 310 , a training unit 312 , a decision unit 314 and an elimination unit 316 .
  • the recording unit 310 is configured to perform the recording operations in S202 and S204 , and store the data to be read and the request information of the data to be written into the storage unit 304 .
  • the decision unit 314 is configured to judge the current data to be written in S205, and determine the first data to be written and the second data to be written.
  • the training unit 312 is configured to mark the first data to be written. Further, the operation of training the prediction model according to the recorded request information and annotation information in S207 is also performed by the training unit 312 .
  • the decision unit 314 is configured to perform an operation of predicting the writing probability of the data to be predicted.
  • the decision unit 314 is further configured to determine in S209 whether the second data to be written satisfies the special rule.
  • the operation of determining whether to write the second data to be written into the cache according to the write probability and the write threshold obtained in S208 is also performed by the decision unit 314 .
  • The operation of determining the to-be-written data to be written into the cache, writing the data into the cache, and counting the size of the data in the cache is also performed by the decision unit 314.
  • the elimination unit 316 is configured to output the partial data in the cache according to the elimination rule in S212.
  • the labeling state of the eliminated data is updated according to the elimination situation of the data, and the operation of calculating the average elimination period is performed by the elimination unit 316.
  • the present application also provides a computing device 400 .
  • the computing device includes a bus 402 , a processor 404 , a memory 406 and a communication interface 408 . Communication between processor 404 , memory 406 and communication interface 408 is via bus 402 .
  • Computing device 400 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 400 .
  • the bus 402 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one line is shown in FIG. 4, but it does not mean that there is only one bus or one type of bus.
  • The bus 402 may include pathways for communicating information between various components of the computing device 400 (e.g., the memory 406, the processor 404, the communication interface 408).
  • The processor 404 may include any one or more of processing devices such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • Memory 406 may include volatile memory, such as random access memory (RAM).
  • The memory 406 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the memory 406 stores executable program codes, and the processor 404 executes the executable program codes to implement the aforementioned cache management method 200 .
  • the memory 406 stores instructions for the cache management apparatus 104 to execute the cache management method 200 .
  • The communication interface 408 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 400 and other devices or a communication network.
  • Embodiments of the present application further provide a computing device cluster.
  • the computing device cluster includes at least one computing device 400 .
  • The computing devices included in the computing device cluster may all be terminal devices, may all be cloud servers, or may be partly cloud servers and partly terminal devices.
  • the memory 406 in one or more computing devices 400 in the computing device cluster may store the same cache management apparatus 104 for executing the instructions of the cache management method 200 .
  • one or more computing devices 400 in the computing device cluster may also be used to execute part of the instructions of the cache management apparatus 104 for executing the cache management method 200 .
  • a combination of one or more computing devices 400 may collectively execute the instructions of the cache management apparatus 104 for executing the cache management method 200 .
  • the memories 406 in different computing devices 400 in the computing device cluster may store different instructions for executing some functions of the cache management apparatus 104 .
  • Figure 6 shows one possible implementation.
  • two computing devices 400A and 400B are connected through a communication interface 408 .
  • Instructions for performing the functions of the communication unit 302, the storage unit 304, the recording unit 310, the training unit 312, the decision unit 314, and the elimination unit 316 are stored in memory in the computing device 400A.
  • Instructions for performing the functions of cache unit 306 are stored on memory in computing device 400B.
  • the memories 406 of the computing devices 400A and 400B collectively store the instructions for the cache management apparatus 104 to execute the cache management method 200 .
  • The connection mode between the computing devices in the cluster shown in FIG. 6 takes into account that the cache management method 200 provided by the present application needs to perform high-speed write or read operations on the data in the cache unit 306. Therefore, the caching function is offloaded to the computing device 400B.
  • The functions of the computing device 400A shown in FIG. 6 may also be performed by multiple computing devices 400.
  • the functions of computing device 400B may also be performed by multiple computing devices 400 .
  • one or more computing devices in a cluster of computing devices may be connected by a network.
  • the network may be a wide area network or a local area network, or the like.
  • Figure 7 shows one possible implementation. As shown in FIG. 7 , two computing devices 400C and 400D are connected through a network. Specifically, the network is connected through a communication interface in each computing device.
  • The memory 406 in the computing device 400C stores instructions for executing the communication unit 302, the storage unit 304, the recording unit 310, the decision unit 314, and the elimination unit 316. Meanwhile, the memory 406 in the computing device 400D stores instructions for executing the cache unit 306 and the training unit 312.
  • The connection mode between the computing devices in the cluster shown in FIG. 7 takes into account that the cache management method 200 provided by the present application needs to perform high-speed write or read operations on the data in the cache unit 306, as well as a large number of calculations to train the prediction model. Therefore, the functions implemented by the cache unit 306 and the training unit 312 are performed by the computing device 400D.
  • The functions of the computing device 400C shown in FIG. 7 may also be performed by multiple computing devices 400.
  • the functions of computing device 400D may also be performed by multiple computing devices 400 .
  • Embodiments of the present application also provide a computer-readable storage medium.
  • The computer-readable storage medium may be any available medium that a computing device can access, or a data storage device, such as a data center, that contains one or more available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state drives), and the like.
  • The computer-readable storage medium includes instructions that instruct the computing device to perform the functions of the above-mentioned cache management apparatus 104 for executing the cache management method 200.
  • Embodiments of the present application also provide a computer program product including instructions.
  • the computer program product may be a software or program product containing instructions, capable of being executed on a computing device or stored in any available medium.
  • the computer program product runs on at least one computer device, the at least one computer device is caused to execute the above-mentioned cache management method 200 .


Abstract

A cache management method for use in a cache management apparatus, the method comprising: after receiving a first data write request, training a cache write prediction model on the basis of relevant parameters of the first data, and writing the first data to the cache; and then receiving a second data write request and, on the basis of the cache write prediction model, determining whether to write the second data to the cache. The present cache management method uses a data-driven algorithm for online training of the cache write prediction model, effectively increasing the cache hit rate.

Description

一种缓存管理方法、装置及设备A cache management method, device and device 技术领域technical field
本申请涉及缓存领域,特别涉及一种缓存管理方法、装置及设备。The present application relates to the field of cache, and in particular, to a cache management method, apparatus and device.
背景技术Background technique
应用级缓存是数据库、内容分发网络、数据存储等系统中,用于加速数据访问的常用存储组件。缓存的主要功能是临时保存近期会被系统重复访问的数据,从而减少数据访问的平均时延。缓存介质访问(读写)速度快但存储容量小。缓存的核心竞争力是高命中率。其中,命中率指示的是数据访问缓存并获得返回数据的比例。Application-level caching is a common storage component used to accelerate data access in systems such as databases, content distribution networks, and data storage. The main function of the cache is to temporarily save data that will be repeatedly accessed by the system in the near future, thereby reducing the average latency of data access. Cache media has fast access (read and write) speed but small storage capacity. The core competitiveness of cache is high hit rate. Among them, the hit rate indicates the proportion of data accessing the cache and getting the returned data.
因此,如何提升缓存的命中率成为了业界最为关注的问题。Therefore, how to improve the cache hit rate has become the most concerned issue in the industry.
发明内容SUMMARY OF THE INVENTION
本申请提供了一种缓存管理方法,该方法可以提升缓存命中率。The present application provides a cache management method, which can improve the cache hit rate.
本申请的第一方面提供了一种缓存管理方法,该缓存管理方法用于缓存管理装置,该方法包括:接收第一数据写入请求,该第一数据写入请求用于请求将硬盘中的第一数据写入缓存;根据该第一数据的相关参数,训练缓存写入预测模型;将该第一数据写入该缓存;接收第二数据写入请求,该第二数据写入请求用于请求将该硬盘中的第二数据写入该缓存;根据该缓存写入预测模型,确定是否将该第二数据的写入该缓存。A first aspect of the present application provides a cache management method. The cache management method is used in a cache management device, and the method includes: receiving a first data write request, where the first data write request is used to request to write data stored in a hard disk. Write the first data into the cache; train the cache to write the prediction model according to the relevant parameters of the first data; write the first data into the cache; receive a second data write request, the second data write request is used for Request to write the second data in the hard disk into the cache; according to the cache write prediction model, determine whether to write the second data into the cache.
该缓存管理方法通过在线训练缓存写入预测模型,提升了对待写入数据的写入概率预测精度,提高了缓存的命中率。The cache management method improves the write probability prediction accuracy of the data to be written and improves the cache hit rate by training the cache write prediction model online.
在一些可能的设计中,该方法还包括:接收该第一批数据写入请求;根据采样规则,从该第一批数据写入请求中确定该第一数据写入请求和该第二数据写入请求。In some possible designs, the method further includes: receiving the first batch of data write requests; and determining the first data write request and the second data write request from the first batch of data write requests according to a sampling rule input request.
在一些可能的设计中,该方法还包括:获取该第一批数据写入请求中各数据写入请求携带的待写入数据的标识;确定该第一批数据写入请求中各数据写入请求携带的待写入数据的标识的哈希值可以被采样值整除的为该第一数据写入请求;确定该第一批数据写入请求中各数据写入请求携带的待写入数据的标识的哈希值不可以被采样值整除的为该第二数据写入请求。In some possible designs, the method further includes: obtaining an identifier of the data to be written carried by each data write request in the first batch of data write requests; determining each data write in the first batch of data write requests If the hash value of the identifier of the data to be written carried by the request can be divisible by the sampling value, it is the first data write request; determine the value of the data to be written carried by each data write request in the first batch of data write requests. It is the second data write request that the identified hash value is not divisible by the sample value.
在一些可能的设计中,该方法还包括:接收第二批数据写入请求;根据该训练后的缓存写入预测模型,确定是否将该第二批数据写入请求对应的待写入数据中至少一个数据写入该缓存。In some possible designs, the method further includes: receiving a second batch of data write requests; and determining, according to the trained cache write prediction model, whether to write the second batch of data into the to-be-written data corresponding to the second batch of data write requests At least one data is written to this cache.
在一些可能的设计中,该方法还包括:将该缓存中数据平均大小、历史每周期总写入请求数、历史每周期总读取请求数、该缓存中数据平均淘汰周期中至少一个,和该第一数据的相关参数作为该缓存写入预测模型的训练输入量;根据该第一数据请求发生后一个该平均淘汰周期内该第一数据的请求情况和淘汰情况,确定该第一数据的写入概率作为该缓存写入预测模型的训练输出量:根据该训练输入量和该训练输出量,训练缓存写入预测模型。In some possible designs, the method further includes: at least one of an average size of data in the cache, a historical total number of write requests per cycle, a historical total number of read requests per cycle, and an average elimination cycle of data in the cache, and The relevant parameters of the first data are used as the training input of the cache write prediction model; according to the request situation and the elimination situation of the first data in the average elimination period after the first data request occurs, determine the first data The write probability is used as the training output of the cache to write the prediction model: according to the training input and the training output, the training cache writes the prediction model.
在一些可能的设计中,该第一数据的相关参数包括下列至少一个:该第一数据的历史被请求写入数、该第一数据的历史被请求读取数。In some possible designs, the relevant parameters of the first data include at least one of the following: the number of historically requested writes of the first data, and the number of historically requested reads of the first data.
在一些可能的设计中,该方法还包括:获取写入阈值;根据该缓存写入预测模型,获得该第二数据的写入概率;根据该写入阈值和该第二数据的写入概率,确定是否将该第二数据的写入该缓存。In some possible designs, the method further includes: obtaining a write threshold; obtaining a write probability of the second data according to the cache write prediction model; according to the write threshold and the write probability of the second data, Determine whether to write the second data to the cache.
在一些可能的设计中,该方法还包括:将该缓存中数据平均大小、该缓存中数据平均淘汰周期、历史每周期总写入请求数、历史每周期总读取请求数中的至少一个,和该第二数据相关参数作为该缓存写入预测模型的预测输入量;将该预测输入量作为该训练后的缓存写入预测模型的输入,获得该第二数据的写入概率。In some possible designs, the method further includes: at least one of the average size of the data in the cache, the average elimination period of the data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, The parameter related to the second data is used as the prediction input of the cache write prediction model; the prediction input is used as the input of the trained cache write prediction model to obtain the write probability of the second data.
A second aspect of the present application provides a cache management apparatus, which includes a communication unit and a processing unit. The communication unit is configured to receive a first data write request, where the first data write request is used to request that first data in a hard disk be written into a cache. The processing unit is configured to train a cache write prediction model according to relevant parameters of the first data, and to write the first data into the cache. The communication unit is further configured to receive a second data write request, where the second data write request is used to request that second data in the hard disk be written into the cache. The processing unit is further configured to determine whether to write the second data into the cache according to the cache write prediction model.
In some possible designs, the communication unit is configured to receive a first batch of data write requests, and the processing unit is configured to determine the first data write request and the second data write request from the first batch of data write requests according to a division rule.
In some possible designs, the processing unit is configured to: obtain the identifier of the data to be written carried in each data write request of the first batch of data write requests; determine each data write request whose carried identifier has a hash value divisible by a sampling value as the first data write request; and determine each data write request whose carried identifier has a hash value not divisible by the sampling value as the second data write request.
In some possible designs, the communication unit is configured to receive a second batch of data write requests, and the processing unit is configured to determine, according to the trained cache write prediction model, whether to write at least one piece of the data to be written corresponding to the second batch of data write requests into the cache.
In some possible designs, the processing unit is configured to: use at least one of the average size of data in the cache, the historical total number of write requests per cycle, the historical total number of read requests per cycle, and the average eviction cycle of data in the cache, together with the relevant parameters of the first data, as the training input of the cache write prediction model; determine the write probability of the first data as the training output of the cache write prediction model according to how the first data is requested and evicted within one average eviction cycle after the first data request occurs; and train the cache write prediction model according to the training input and the training output.
In some possible designs, the relevant parameters of the first data include at least one of the following: the historical number of write requests for the first data, and the historical number of read requests for the first data.
In some possible designs, the communication unit is configured to obtain a write threshold, and the processing unit is configured to obtain the write probability of the second data according to the cache write prediction model, and to determine whether to write the second data into the cache according to the write threshold and the write probability of the second data.
In some possible designs, the processing unit is configured to: use at least one of the average size of data in the cache, the average eviction cycle of data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, together with the relevant parameters of the second data, as the prediction input of the cache write prediction model; and feed the prediction input into the trained cache write prediction model to obtain the write probability of the second data.
A third aspect of the present application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory. The processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device performs the method provided by the first aspect or any possible design of the first aspect.
A fourth aspect of the present application provides a computer program product containing instructions which, when run by a computer device cluster, cause the computer device cluster to perform the method provided by the first aspect or any possible design of the first aspect.
A fifth aspect of the present application provides a computer-readable storage medium including computer program instructions which, when executed by a computing device cluster, cause the computing device cluster to perform the method provided by the first aspect or any possible design of the first aspect.
Description of Drawings
To illustrate the technical methods of the embodiments of the present application more clearly, the accompanying drawings used in the embodiments are briefly introduced below.
FIG. 1 is a schematic diagram of a possible application scenario applicable to an embodiment of the present application.
FIG. 2 is a flowchart of a possible cache management method applicable to an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a possible cache management apparatus applicable to an embodiment of the present application.
FIG. 4 is a schematic structural diagram of a possible computing device applicable to an embodiment of the present application.
FIG. 5 is a schematic structural diagram of a possible computing device cluster applicable to an embodiment of the present application.
FIG. 6 is a schematic structural diagram of a possible computing device cluster applicable to an embodiment of the present application.
FIG. 7 is a schematic structural diagram of a possible computing device cluster applicable to an embodiment of the present application.
Detailed Description
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
First, some technical terms involved in the embodiments of the present application are introduced.
Cache: an intermediate layer between a fast storage medium and a service system; a component used to temporarily store recently accessed data and reduce data access latency.
Cache write: the process of deciding whether to write data into the cache when a data write request arrives at the cache system.
Cache hit rate: when an end user accesses an acceleration node, a hit occurs if the node has cached the data being accessed. If not, the data has to be fetched from the origin server, which is a miss. Fetching proceeds in parallel with the user's access, so the user perceives no delay even when new data is fetched. The hit rate equals the number of hits divided by the sum of hits and misses. The cache hit rate is one of the important factors for judging the acceleration effect.
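The hit-rate formula above is a simple ratio; a minimal illustration (function name is ours, not from the application):

```python
def hit_rate(hits, misses):
    """Cache hit rate = hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0
```

For example, 80 hits and 20 misses give a hit rate of 0.8.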
The access speed of a cache medium is usually faster than that of the system's main storage; in typical scenarios the speeds differ by several orders of magnitude, and application systems usually use memory as the cache medium for hard disks and remote networks. Relative to the data scale of the application system, the storage capacity available to the cache medium is extremely small. For example, a typical Internet service database usually stores petabytes (PB) of data, but the cache capacity is only several gigabytes (GB), or even less than 1 GB.
Because the cache medium is fast, the higher the proportion of system accesses that hit in the cache, the better the system's performance usually is. However, because the cache medium has limited storage capacity, the cache must also continuously evict data to free up space for caching newly written data. Common cache eviction algorithms include least recently used (LRU) and first-in, first-out (FIFO). These methods all tend to retain new data and evict old data.
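The LRU eviction just mentioned can be sketched in a few lines. This is a generic textbook illustration of the policy, not the eviction logic of this application:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch: recently used items kept, least recently used evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None              # miss
        self.data.move_to_end(key)   # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

Under a linear scan, every `put` of a once-only item pushes out an older entry, which is exactly the thrashing behavior discussed below.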
In current mainstream cloud and Internet services, the frequency with which data is requested usually follows a long-tail distribution: the vast majority of access requests concentrate on a small portion of the data, and the vast majority of data is accessed only once. Under such a distribution, large amounts of newly requested data are continuously written into memory and old data is continuously evicted, but the newly written data is never accessed a second time, which often leads to a low cache hit rate. On the other hand, linearly traversing data is also a common workload in Internet services: data is scanned linearly one item at a time, with no second access during the scan. In addition, various network attacks often attempt to scan the system's entire data set. In scenarios such as long-tail distributions and linear scans, large amounts of data are accessed only once within a short time, and mainstream cache eviction algorithms then cause rapid cache flushing (also called thrashing): data is continuously written into the cache while the hit rate remains extremely low.
In such cases, introducing a cache write function can usually slow cache flushing and improve the hit rate. Cache write means judging, before data enters the cache medium, whether the data is worth admitting into the cache system; if it is not, the data is not placed in the cache medium.
At present, there are related methods for managing cache writes.
For example, to cope with linear scans, some systems do not allow data that has recently been accessed for the first time to enter the cache; only data accessed a second time or more is written into the cache. Other systems rely on statistics, such as counting each data item's recent access count, and set a threshold: if the access count exceeds the threshold, the data is written into the cache; otherwise it is blocked. There are also other rule-based or frequency-statistics-based write techniques.
These methods usually adapt poorly to the business: before a system goes live, little is usually known about the data that will actually enter it. In this situation, the results of formulating rules in advance, or of deciding on statistical indicators and thresholds, are hard to predict.
Moreover, in common application systems, the distribution of data changes as business requirements and scenarios change. If the write scheme cannot adjust to follow the changing data, the cache hit rate is prone to decline.
In view of this, embodiments of the present application provide a technical solution that can adaptively adjust the write policy, that is, an online adaptive cache write method. A data-driven algorithm automatically predicts the future hit probability of data while the system is running, and learns the write model online so as to quickly adapt to changes in business data and achieve accurate and efficient write results.
To make the technical solution of the present application clearer and easier to understand, the scenario of the cache write method 200 is introduced below with reference to FIG. 1.
In one possible implementation, after a user 101 clicks an application 102, the application 102 is triggered to initiate a data read request to a cache 103. The application 102 may be a web application, or third-party application software on a smart terminal, or the like.
It should be noted that the data read request initiated by the application 102 to the cache 103 is recorded by a cache management apparatus 104. Specifically, information such as the identifier of the requested data and the request time of the data read request is recorded. Meanwhile, the historical request count of the data may also be updated.
When the data exists in the cache 103, the cache 103 returns the data to the application 102. Meanwhile, the cache management apparatus 104 records the result of the data request, that is, that the data read request was served.
When the data does not exist in the cache 103, the application 102 initiates a data read request to a hard disk 105. Meanwhile, the cache management apparatus 104 records the result of the data request, that is, that the data read request was not served. The hard disk 105 may include a solid-state drive, a conventional hard disk, or a hybrid hard disk.
When the data exists in the hard disk 105, the hard disk 105 returns the data to the application 102. Meanwhile, the cache management apparatus 104 decides whether to write the data into the cache 103.
In this class of possible implementations, the request to write data into the cache 103 is triggered by the user 101 clicking the application 102.
When the data does not exist in the hard disk 105 either, the system returns to the application 102 a message that the data cannot be found.
In another possible implementation, the request to write data into the cache 103 may be triggered by the application 102 itself, for example by the refresh and warm-up operations of a content delivery network (CDN).
Specifically, for a uniform resource locator (URL) that a tenant of the application 102 specifies for refresh or warm-up, the application 102 proactively obtains updated data from the origin site, so that the user 101 does not have to go back to the tenant's origin site to obtain the data when accessing the CDN. Therefore, when a refresh or warm-up action is triggered, the cache management apparatus 104 judges whether to write this data into the cache 103.
A cache write method 200 provided by the present application is introduced below. The cache write method may run on the cache management apparatus 104.
A flowchart of the cache write method 200 is shown in FIG. 2. The cache write method includes four parts: request information recording, prediction model training, write judgment, and cached data eviction.
First, the request information that the cache management apparatus 104 may receive includes data write requests and data read requests. Recording parameters of the request information, such as the request time and request count, provides data for training the prediction model. Specifically, the request information recording part includes steps S201 to S204.
S201: The cache management apparatus 104 receives a data read request.
As described above, after the user 101 clicks the application 102, the application 102 initiates a data read request to the cache management apparatus 104. In other words, when a data read request arrives, the cache management apparatus 104 receives it. The data read request indicates that data in the cache 103 is to be returned to the application 102.
S202: The cache management apparatus 104 records the request information of the data to be read.
When a data read request arrives, the cache management apparatus 104 records the information of the data to be read and the request time. The information of the data may be an identifier (ID). Optionally, it may also be information such as a universally unique identifier (UUID) of the data. The ID is used as an example below.
From the recorded time of each data read request, at least one of the following parameters can be calculated: each data item's requested reads per cycle, and its historical average requested reads per cycle. The cycle length can be set as required. For example, if the cycle is 1 second, a data item's requested reads per cycle indicates how many times the data was requested to be read within 1 second, and its historical average requested reads per cycle indicates the average of its per-cycle requested reads over the historical counting period. The historical counting period can also be set as required.
Optionally, to record each data item's requested reads per cycle over a period of time, an 8-byte data group may be maintained for each data item. Further, the 8 bytes are evenly divided into 32 units, each unit being 2 bits, the units being numbered 1, 2, ..., 32. Each unit can represent four states (00, 01, 10, 11) of the data item's requested reads in a cycle, corresponding respectively to no read request and to one, two, and three-or-more read requests. In other words, by maintaining an 8-byte data group for each data item, its requested reads per cycle over the past 32 cycles can be recorded. This storage method allows more per-cycle request counts to be stored in less storage space.
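The 8-byte, 32-unit layout just described can be sketched with plain bit operations on a 64-bit value. The function names and the saturating increment (capping each slot at the "three or more" state, 3) are our illustrative reading of the description:

```python
SLOTS = 32  # 8 bytes = 64 bits = 32 two-bit units, one unit per cycle

def record_request(history, cycle):
    """Increment the 2-bit counter for the given cycle's slot, saturating at 3.

    `history` is the 64-bit per-item data group; slots are reused modulo 32,
    so the group always covers the most recent 32 cycles.
    """
    shift = 2 * (cycle % SLOTS)
    count = (history >> shift) & 0b11
    if count < 3:  # state 11 already means "three or more requests"
        history = (history & ~(0b11 << shift)) | ((count + 1) << shift)
    return history

def requests_in_cycle(history, cycle):
    """Read back the 2-bit state (0, 1, 2, or 3 = three or more) for a cycle."""
    return (history >> (2 * (cycle % SLOTS))) & 0b11
```

Two requests in the same cycle yield state 2; a fourth and any later request in that cycle leave the slot saturated at 3.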
Optionally, from the recorded request time of each data read request, the total number of requested reads in each cycle can also be obtained, that is, the total requested reads per cycle. Likewise, after recording these per-cycle totals, one may choose to retain only the totals within a certain historical time, whose length can be set as required. Further, the total requested reads per cycle over a period of time can also be recorded by maintaining a data group of several bytes; for the specific method, refer to the method described above for maintaining the data group corresponding to the per-cycle requested reads.
It should be noted that S201 and S202 are not mandatory steps. As described above, when a data write request is triggered not by the user 101 but by a refresh or warm-up action of the application 102, S201 and S202 are not necessary.
S203: The cache management apparatus 104 receives a data write request.
As described above, a data write request arises in two situations: triggered by the user 101 or triggered by the application 102. In both situations, the cache management apparatus 104 receives the data write request. The data write request indicates that the requested data in the hard disk is to be written into the cache.
In one possible implementation, after the user 101 clicks the application 102, the application 102 is triggered to initiate a data write request to the cache 103. Meanwhile, the data write request initiated by the application 102 to the cache 103 is recorded by the cache management apparatus 104.
When the data requested to be written exists in the cache 103, the cache 103 returns the data to the application 102. Meanwhile, the cache management apparatus 104 records the result of the data request, that is, that the data read request was served.
When the data requested to be written does not exist in the cache 103, the application 102 initiates a data read request to the hard disk 105. Meanwhile, the cache management apparatus 104 records the result of the data request, that is, that the data read request was not served.
When the data exists in the hard disk 105, the hard disk 105 returns the requested data to the application 102. Meanwhile, the cache management apparatus 104 decides whether to write the requested data into the cache 103. That is, the cache management apparatus 104 receives a write request for the data requested to be written.
In this class of possible implementations, the request to write data into the cache 103 is triggered by the user 101 clicking the application 102, that is, by steps S101 and S102.
In another possible implementation, the request to write data into the cache 103 may be triggered by the application 102, for example by the refresh and warm-up of a content delivery network (CDN). In that case, the cache management apparatus 104 receives the write request for the data.
In some possible implementations, while accumulating a certain amount of training data, the cache management apparatus 104 receives a batch of data write requests. The amount may be a sampling threshold and can be set as required.
For example, the first batch of data write requests denotes the set of data write requests received by the cache management apparatus 104 within a certain time period. In step S205, the first data write request and the second data write request in the first batch of data write requests can be determined according to a sampling rule. The first data write request or the second data write request may include one or more data write requests. Meanwhile, the first data mentioned below denotes the data requested to be written by the first data write request; likewise, the second data denotes the data requested to be written by the second data write request.
The first batch of data write requests is used as an example below.
S204: The cache management apparatus 104 records the request information of the data to be written.
When the first batch of data write requests arrives, the ID and request time of each data item in the first batch are recorded. Further, at least one of the following parameters can be obtained: each data item's requested writes per cycle, its historical average requested writes per cycle, and the total requested writes per cycle. The cycle length can be set as required. For example, if the cycle is 1 second, a data item's requested writes per cycle indicates how many times it was requested to be written within 1 second, and its historical average requested writes per cycle indicates the average of its per-cycle requested writes over the historical counting period.
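The per-cycle counts and historical average described in this step can be derived from the recorded request times roughly as follows. The cycle length, function names, and the choice to bucket timestamps by integer division are illustrative assumptions:

```python
from collections import Counter

CYCLE = 1.0  # hypothetical cycle length in seconds; the text leaves it configurable

def per_cycle_counts(request_times):
    """Requested writes per cycle for one data item, keyed by cycle index,
    computed from its recorded request timestamps (seconds)."""
    return Counter(int(t // CYCLE) for t in request_times)

def historical_average(request_times, n_cycles):
    """Historical average requested writes per cycle over n_cycles cycles."""
    return len(request_times) / n_cycles
```

For example, requests at t = 0.1 s, 0.5 s, and 1.2 s give two requests in cycle 0, one in cycle 1, and an average of 1.0 per cycle over 3 cycles.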
From the recorded request time of each data write request, the total requested writes per cycle can be obtained. This indicator denotes the total number of write requests, across all data, received by the cache management apparatus 104 within 1 second.
Further, after recording each data item's requested writes per cycle, one may choose to retain only the per-cycle counts within a certain time. In a large-scale data system the data base is large, and storing every data item's per-cycle requested writes over a long time would occupy considerable storage space. Therefore, to reduce storage space, methods such as optimizing the data storage and shortening the retained time period can be adopted.
Taking optimized data storage as an example, an 8-byte data group may be maintained for each data item. Further, the 8 bytes are evenly divided into 32 units, each unit being 2 bits, the units being numbered 1, 2, ..., 32. Each unit can represent four states (00, 01, 10, 11) of the data item's requested writes in a cycle, corresponding respectively to no write request and to one, two, and three-or-more write requests. In other words, by maintaining an 8-byte data group for each data item, its requested writes per cycle over the past 32 cycles can be recorded. This storage method allows more per-cycle request counts to be stored in less storage space.
From the recorded request time of each data write request, the total requested writes in each cycle, that is, the total requested writes per cycle, can be obtained. Likewise, after recording these per-cycle totals, one may choose to retain only the totals within a certain time. Further, the total requested writes per cycle over a period of time can also be recorded by maintaining a data group of several bytes; for the specific method of maintaining the data group, refer to the method described above for maintaining the data group corresponding to the per-cycle requested writes.
It should be noted that the cache management apparatus 104 may receive read requests and write requests independently of each other. Therefore, the execution of these two steps (S203 and S204) has no fixed order relative to the execution of steps S201 and S202 described above. In other words, steps S203 and S204 may be performed before or after steps S201 and S202, or, optionally, simultaneously with them.
在步骤S201至步骤S204中接收到数据读取请求和数据写入请求,并记录相关的信息后,可以利用其中的部分数据及相关信息对缓存写入预测模型进行训练。具体地, 预测模型训练部分包括步骤S205至S207。After receiving the data read request and the data write request in steps S201 to S204, and recording the related information, the cache write prediction model can be trained by using part of the data and the related information. Specifically, the prediction model training part includes steps S205 to S207.
S205: The cache management apparatus 104 determines whether the data to be written satisfies the sampling rule.
Based on the identifier of each piece of data to be written obtained in step S204, it can be determined whether that data satisfies the sampling rule. Specifically, the first data write requests and the second data write requests in the first batch of data write requests may be determined according to the sampling rule; that is, the first to-be-written data and the second to-be-written data in the first batch of data to be written can be determined.
Here, the first to-be-written data refers to data to be written that satisfies the sampling rule, and the second to-be-written data refers to data to be written that does not. While the first to-be-written data is written into the cache, it is also used to train the cache write prediction model. The second to-be-written data is processed in step S209.
Optionally, a large-scale data system holds a very large volume of data, making it impractical to use every piece of data as a training sample for the prediction model. Therefore, a portion of the data needs to be sampled as training samples. At least two factors affect sampling: the sampling rate and the sampling rule.
The sampling rate indicates the average probability that a piece of data is sampled, usually taking a value between 0 and 1. The lower the sampling rate, the slower samples are collected.
The sampling rule can be determined based on the identifier of the data in the system, for example an ID, URL, or UUID. This application uses the ID as an example to establish the sampling rule. Generally, an ID can be a string or a number. After an operation result is computed from the data's ID, whether to sample the data can be determined from the operation result and a sampling value. The operation may be a neural network, a hash algorithm, or another method; the embodiments of this application do not limit the operation method. The sampling value can be set as needed, for example equal to the sampling rate.
Specifically, the hash result is taken modulo the sampling value. If the remainder for a data ID is 0, the corresponding data to be written is treated as first to-be-written data, i.e., training data. Otherwise, if the remainder is not 0, the data is not sampled and processing proceeds to step S209.
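As a minimal sketch of this ID-based sampling rule: here MD5 stands in for the unspecified hash algorithm, and the integer sampling value of 100 (roughly a 1% sampling rate) is an illustrative assumption.

```python
import hashlib

SAMPLING_VALUE = 100  # illustrative: remainder base corresponding to ~1% sampling

def is_first_write_data(data_id: str) -> bool:
    """Return True if the data ID satisfies the sampling rule (hash remainder 0)."""
    digest = hashlib.md5(data_id.encode()).hexdigest()
    return int(digest, 16) % SAMPLING_VALUE == 0

# First to-be-written data (sampled, used as training data); the rest is
# second to-be-written data and proceeds to step S209.
batch = [f"obj-{i}" for i in range(1000)]
sampled = [d for d in batch if is_first_write_data(d)]
```

Because the decision depends only on the ID, repeated writes of the same data are consistently sampled or consistently skipped, which keeps the training samples self-consistent across cycles.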
S206: The cache management apparatus 104 labels the first to-be-written data that satisfies the sampling rule.
The cache management apparatus 104 may label the first to-be-written data according to how it is accessed and evicted.
In one possible implementation, the first to-be-written data may be labeled according to how it is requested within a period of time after the determination in S205. That is, its label can be determined from the number of times it is requested to be written and the number of times it is requested to be read within one average cache eviction cycle.
The length of this period of time can be set as needed. The following description uses the average cache eviction cycle as an example; how the average cache eviction cycle is obtained is described in detail in step S213.
In this type of implementation, when the first to-be-written data, within one average eviction cycle, satisfies at least one of the following conditions: it is requested to be written at least once, or it is requested to be read at least once, it is labeled as hot data.
Conversely, when the first to-be-written data is neither requested to be written nor requested to be read within one average cache eviction cycle, it is labeled as cold data.
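The hot/cold labeling rule above reduces to a simple predicate over the request counts observed during one average eviction cycle. A sketch (the function and parameter names are illustrative):

```python
def label_sample(writes_in_cycle: int, reads_in_cycle: int) -> str:
    """Label a sampled (first to-be-written) data item over one average cache
    eviction cycle: hot if it was written or read at least once,
    cold if it was neither written nor read."""
    return "hot" if writes_in_cycle >= 1 or reads_in_cycle >= 1 else "cold"

assert label_sample(1, 0) == "hot"   # written once   -> hot
assert label_sample(0, 2) == "hot"   # read twice     -> hot
assert label_sample(0, 0) == "cold"  # never accessed -> cold
```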
In another possible implementation, the training data may also be labeled according to whether the first to-be-written data is evicted within the next average eviction cycle.
As described above, if the data to be written satisfies the sampling rule in S205, the first to-be-written data is written directly into the cache while being labeled. Because the storage space of the cache is limited, stored data needs to be evicted according to an eviction rule. The specific eviction rules are described in detail in S212.
When the first to-be-written data is neither requested to be written nor requested to be read within one average cache eviction cycle and before it is evicted from the cache, it is labeled as cold data after the cache evicts it.
It should be noted that once the first to-be-written data has been labeled, its label is not modified again. After the labeling of the data to be written is completed, processing may proceed to step S207 to train the prediction model.
S207: The cache management apparatus 104 trains the prediction model according to the recorded information and the labeled first to-be-written data.
After the first to-be-written data is labeled in S206, the prediction model can be trained using the label of each piece of first to-be-written data together with at least one of the following parameters: the per-cycle write-request count, the total write requests per cycle, the average size of cached data, the average cache eviction cycle, the per-cycle read-request count, and the total read requests per cycle.
Specifically, the average total write requests per cycle can be obtained from the total write requests per cycle obtained in S202. As described above, a data group of several bytes is maintained in S202, recording the total write requests per cycle over a certain period of time. The average total write requests per cycle can then be obtained by averaging that data group.
Optionally, the average total read requests per cycle can be obtained from the total read requests per cycle obtained in S204. Specifically, as described above, a data group of several bytes is maintained in S204, recording the total read requests per cycle over a certain period of time. The average total read requests per cycle can then be obtained by averaging that data group.
The average size of cached data indicates the average size of the data stored in the cache medium. The specific method for obtaining it is described in detail in step S211.
The average cache eviction cycle indicates the average duration from when a piece of data enters the cache to when it is evicted. The specific method for obtaining it is described in detail in step S213.
The prediction model can be trained using at least one of the following as input: the average size of cached data, the average cache eviction cycle, the average total write requests per cycle, the per-cycle write-request count of each piece of data to be written, the average total read requests per cycle, and the per-cycle read-request count of each piece of data to be written; the label of each piece of data to be written is used as the output.
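The assembly of training inputs and outputs described above might look like the following sketch. The feature ordering, the dictionary keys, and the 0/1 encoding of cold/hot (anticipating the preprocessing mentioned below) are illustrative assumptions, not part of the claims; the model itself is left abstract.

```python
# One training sample per labeled first to-be-written data item:
# global cache statistics plus the item's own per-cycle request counts
# as input, and the cold/hot label encoded as 0/1 as output.

def make_sample(item: dict, stats: dict) -> tuple:
    features = [
        stats["avg_cached_data_size"],
        stats["avg_eviction_cycle"],
        stats["avg_total_writes_per_cycle"],
        stats["avg_total_reads_per_cycle"],
        item["writes_per_cycle"],
        item["reads_per_cycle"],
    ]
    label = 1 if item["label"] == "hot" else 0   # cold/hot -> 0/1
    return features, label

stats = {"avg_cached_data_size": 512.0, "avg_eviction_cycle": 30.0,
         "avg_total_writes_per_cycle": 80.0, "avg_total_reads_per_cycle": 200.0}
items = [
    {"writes_per_cycle": 3, "reads_per_cycle": 5, "label": "hot"},
    {"writes_per_cycle": 0, "reads_per_cycle": 0, "label": "cold"},
]
dataset = [make_sample(it, stats) for it in items]
```

The resulting (features, label) pairs can then be fed to whichever model is chosen, such as the back propagation or LSTM networks mentioned below.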
The prediction model may be an artificial intelligence model such as a back propagation neural network or a long short-term memory (LSTM) network. It should be noted that the embodiments of this application do not limit the method for building the prediction model; since such methods are prior art, they are not described further.
In some possible implementations, training of the prediction model may be started when the number of pieces of data that have been labeled in S206 but not yet used for training reaches or exceeds a sampling threshold. The sampling threshold can be set as needed. For example, a round of training may be started whenever the number of labeled training samples reaches 10,000.
In some possible implementations, the number of training rounds of the prediction model can be configured. Whether further training is needed can be determined from the number of rounds the model has already been trained and a round threshold. When the number of training rounds reaches or exceeds the round threshold, the prediction model is no longer trained.
For example, when the round threshold is set to 100, after the prediction model has been trained for 100 rounds, operations such as sampling and labeling the data to be written can be stopped, and the prediction model is no longer trained.
In some possible implementations, after the prediction model is trained with the first to-be-written data, the first to-be-written data stored for the prediction model may be deleted.
In some possible implementations, the output, i.e., the label of each piece of first to-be-written data, may be preprocessed. Specifically, the cold/hot labels can be converted to 0/1 or 1/0. In this type of implementation, the output of the prediction model can be kept between 0 and 1.
After the data to be written is sampled and the prediction model is trained in steps S205 to S207, a trained prediction model is obtained. For data that does not satisfy the sampling rule in S205, the write probability can be determined based on the prediction model. The prediction model may be the model after training in step S207 or the model before that training; in either case, the model used for prediction must have completed at least one round of training.
The write determination part includes steps S208 to S211.
S208: The cache management apparatus 104 generates the prediction model and uses it to predict the write probability of the data to be predicted.
After the prediction model is trained in S207, a prediction model that has completed at least one round of training is available for predicting the write state of the data to be predicted. The data to be predicted refers to the second to-be-written data determined in step S209 as not satisfying the special rules. The specific method for obtaining the second to-be-written data that does not satisfy the special rules is described in step S209.
It should be noted that the prediction model used in this step to predict the write probability of the second to-be-written data may differ from the model trained in step S207. In some possible implementations, the cache write prediction model used in step S208 is the model before the training in step S207, i.e., the model trained on the data corresponding to the previous batch of data write requests. In other words, a cache write prediction model trained on the first to-be-written data in the first batch of data can be used to predict part of the second batch of data, corresponding to the second batch of data write requests, where the second batch of data write requests occurs later than the first batch.
The write state of the data to be predicted can be predicted using at least one of the following as input: the per-cycle write-request count of the data to be predicted, the average size of cached data, the average cache eviction cycle, the average total write requests per cycle, the average total read requests per cycle, and the per-cycle read-request count of the second to-be-written data.
After the prediction model predicts the write state of the data to be predicted, an output value between 0 and 1 is obtained, i.e., the write probability of the data to be predicted.
S209: The cache management apparatus 104 determines whether the second to-be-written data that does not satisfy the sampling rule satisfies the special rules.
Second to-be-written data determined in S205 not to satisfy the sampling rule is passed to S209 for processing, where it is further determined whether the data satisfies the special rules.
The special rules include forced writing of some data and exclusion of other data. For example, according to tenant requirements, files of special types or files from special domain names may be excluded from writing.
Second to-be-written data that satisfies the special rules is written into the cache; that is, processing proceeds to step S211.
For second to-be-written data that does not satisfy the special rules, processing proceeds to step S210.
It should be noted that step S209 is an optional step in the cache write method 200.
S210: The cache management apparatus 104 determines the write state of the second to-be-written data that does not satisfy the special rules, according to its write probability and the write threshold.
The write state of the data can be determined from the write threshold and the write probability of the second to-be-written data that does not satisfy the sampling rule. The write threshold can be set as needed.
The write probability of the second to-be-written data that does not satisfy the sampling rule can be obtained using the prediction model of S208; that is, this data serves as the data to be predicted for the prediction model. Specifically, the write state of the data can be predicted using at least one of the following as the input of the prediction model: the per-cycle write-request count of the data, the average size of cached data, the average cache eviction cycle, the average total write requests per cycle, the average total read requests per cycle, and the per-cycle read-request count of the data.
In some possible implementations, after the prediction model predicts the write state of the data, an output value between 0 and 1 is obtained, i.e., the write probability of the data.
Further, the write state of the second to-be-written data that does not satisfy the sampling rule can be determined from the write probability and the write threshold.
Specifically, when the write probability is greater than or equal to the write threshold, the data is written into the cache; when the write probability is less than the write threshold, the data is not written into the cache.
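The threshold comparison just described reduces to a one-line decision; a sketch, with the threshold value of 0.5 chosen purely for illustration:

```python
WRITE_THRESHOLD = 0.5  # illustrative value; the write threshold can be set as needed

def should_cache(write_probability: float, threshold: float = WRITE_THRESHOLD) -> bool:
    """Write the data into the cache iff the predicted write probability
    (a model output between 0 and 1) reaches the threshold."""
    return write_probability >= threshold

assert should_cache(0.73) is True    # probability above threshold: written
assert should_cache(0.5) is True     # boundary case: greater than or equal
assert should_cache(0.12) is False   # below threshold: not written
```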
In some possible implementations, the write threshold may be built into the prediction model; that is, the prediction model may directly output the decision on whether to write the second to-be-written data into the cache.
It should be noted that using the prediction model to predict the write probability presupposes that the model has undergone at least one round of training in step S207. In other words, the prediction model is not trained before the accumulated number of pieces of data to be written in step S206 reaches the sampling threshold. Before then, the write state of the second to-be-written data that does not satisfy the sampling rule can be determined according to the prior art, which includes write rules based on specific rules or on frequency statistics and is not described further.
S211: The cache management apparatus 104 writes the data determined to be written into the cache, and records the size of the written data.
The data determined to be written in step S211 covers at least the following three cases:
In some possible implementations, the data determined to be written may be the data that satisfies the sampling rule in step S205, i.e., the first to-be-written data whose ID hash modulo the sampling value is 0.
In some possible implementations, the data determined to be written may also be the second to-be-written data that satisfies the special rules in step S209, where the second to-be-written data refers to data whose ID hash modulo the sampling value is not 0.
In some possible implementations, the data determined to be written may also be the second to-be-written data that does not satisfy the special rules but whose write probability in step S210 is not less than the write threshold.
Optionally, while the above data is written into the cache, the size of each written piece of data needs to be recorded, so that the average size of the data in the cache medium at the current moment can be calculated. This average data size can then be used as an input in step S207 to train the prediction model.
After the write determination is made for the second to-be-written data, part of it is written into the cache. Because the storage space of the cache is limited, some data needs to be evicted periodically. Specifically, the cache data eviction part includes steps S212 to S214.
S212: Determine, according to the eviction rule, whether to evict some data in the cache.
Since the data storage capacity of the cache medium is limited, data needs to be evicted continuously to free up storage space for newly written data. When the amount of written data exceeds the capacity, a portion of the data can be selected for eviction. The specific eviction rules belong to the prior art and are not described further. Common cache eviction algorithms include least recently used (LRU) and first-in, first-out (FIFO).
Written data determined not to be evicted is left untouched; written data determined to be evicted is moved out of the cache.
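As an example of the prior-art eviction rules mentioned above, here is a minimal LRU cache built on Python's `OrderedDict`. For simplicity the capacity is counted in entries rather than bytes, and the class and attribute names are illustrative; a real cache medium would track data sizes as in S211.

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()
        self.evicted = []                    # evicted data, to be output as in S214

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)          # a read marks the entry as recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)      # a rewrite also refreshes recency
        self.store[key] = value
        if len(self.store) > self.capacity:  # over capacity: evict the LRU entry
            self.evicted.append(self.store.popitem(last=False))

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used entry
cache.put("c", 3)     # evicts "b", the least recently used entry
```

A FIFO variant would simply omit the `move_to_end` calls, so eviction order matches insertion order.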
S213: The cache management apparatus 104 updates the label of the evicted data and calculates the average cache eviction cycle.
As described above, the data written into the cache in step S211 covers at least three cases. In some possible implementations, the data to be written may be the first to-be-written data that satisfies the sampling rule in step S205; that is, the first to-be-written data is simultaneously awaiting labeling in step S206.
The first to-be-written data can be labeled according to whether it is evicted from the cache. Specifically, if a piece of evicted data belongs to the first to-be-written data and has not yet been labeled in step S206, it is labeled as cold data. The first to-be-written data labeled in step S206 is then passed to step S207 as training data for the prediction model.
In addition, the average cache eviction cycle can be obtained from the eviction time of each piece of evicted data and its write time, where the write time is the requested write time recorded for the data in step S202. Specifically, subtracting the write time of each piece of evicted data from its eviction time yields its eviction cycle; the average cache eviction cycle can then be determined by averaging the eviction cycles. The evicted data considered here may be all data ever evicted, or, optionally, only the data evicted within a recent period of time. The average cache eviction cycle can then be used as an input in step S207 to train the prediction model.
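The average eviction cycle computation above can be sketched as follows (timestamps are in arbitrary time units, and the record layout is an illustrative assumption):

```python
def average_eviction_cycle(records):
    """records: (write_time, eviction_time) pairs for evicted data.
    Each item's eviction cycle is eviction_time - write_time; the average
    cache eviction cycle is the mean over the considered evicted data."""
    if not records:
        return 0.0
    cycles = [evict_t - write_t for write_t, evict_t in records]
    return sum(cycles) / len(cycles)

# Three evicted items with cycles of 30, 60, and 90 time units.
records = [(100, 130), (200, 260), (50, 140)]
```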
S214: The cache management apparatus 104 outputs the evicted data.
The data determined in step S212 to be evicted is output.
It should be noted that steps S213 and S214 have no fixed execution order; step S213 may be performed before, after, or simultaneously with step S214.
This application further provides a cache management apparatus 104, as shown in FIG. 3, including:
A communication unit 302, configured to receive data read requests in S201 and data write requests in S203. The communication unit 302 is further configured to receive the configured sampling threshold in S207, to obtain the write threshold in S210, and to receive the special rules set by the tenant in S209.
A storage unit 304, configured to store the request information of the data to be read recorded in S202 and the request information of the data to be written recorded in S204. In the cache management method 200, the storage unit 304 stores the information related to the first to-be-written data determined in S205, as well as the parameters of the prediction model trained in S207. The sampling threshold received in S207 and the write threshold received in S210 are also stored in the storage unit 304, along with the special rules set by the tenant in S209.
A cache unit 306, configured, in the cache management method 200, to cache the first to-be-written data determined in S205. The second to-be-written data determined in S209 to satisfy the special rules, and the second to-be-written data whose write probability is greater than the write threshold in S210, are also cached in the cache unit 306.
A processing unit 308, configured to perform the recording operations in S202 and S204 and to store the request information of the data to be read and the data to be written into the storage unit 304. The processing unit 308 is further configured to evaluate the current data to be written in S205 and determine the first to-be-written data and the second to-be-written data. In the cache management method 200, the processing unit 308 labels the first to-be-written data. The operation in S207 of training the prediction model according to the recorded request information and the labels is also performed by the processing unit 308. In S208, the processing unit 308 predicts the write probability of the data to be predicted. The processing unit 308 further determines in S209 whether the second to-be-written data satisfies the special rules. In S210, the operation of determining, according to the write probability obtained in S208 and the write threshold, whether to write the second to-be-written data into the cache is also performed by the processing unit 308, as is the operation in S211 of writing the data determined to be written into the cache and recording the size of the data in the cache. The processing unit 308 is further configured to output part of the data in the cache according to the eviction rule in S212. In S213, the processing unit 308 updates the label of the evicted data according to the eviction situation and calculates the average eviction cycle.
Specifically, the processing unit 308 may include a recording unit 310, a training unit 312, a decision unit 314, and an eviction unit 316.
The recording unit 310 is configured to perform the recording operations in S202 and S204, and to store the request information of the data to be read and the data to be written into the storage unit 304. The decision unit 314 is configured to evaluate the current data to be written in S205 and determine the first to-be-written data and the second to-be-written data. In the cache management method 200, after the first to-be-written data is determined, the training unit 312 labels it. The operation in S207 of training the prediction model according to the recorded request information and the labels is also performed by the training unit 312.
In S208, the decision unit 314 predicts the write probability of the data to be predicted. The decision unit 314 further determines in S209 whether the second to-be-written data satisfies the special rules. In S210, the operation of determining, according to the write probability obtained in S208 and the write threshold, whether to write the second to-be-written data into the cache is also performed by the decision unit 314, as is the operation in S211 of writing the data determined to be written into the cache and recording the size of the data in the cache. The eviction unit 316 is configured to output part of the data in the cache according to the eviction rule in S212. In S213, the eviction unit 316 updates the label of the evicted data according to the eviction situation and calculates the average eviction cycle.
The present application further provides a computing device 400. As shown in FIG. 4, the computing device includes a bus 402, a processor 404, a memory 406, and a communication interface 408. The processor 404, the memory 406, and the communication interface 408 communicate with each other through the bus 402. The computing device 400 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors or memories in the computing device 400.
The bus 402 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one line is used in FIG. 4, but this does not mean that there is only one bus or only one type of bus. The bus 402 may include a path for transferring information between the components of the computing device 400 (for example, the memory 406, the processor 404, and the communication interface 408).
The processor 404 may include any one or more of processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
The memory 406 may include a volatile memory, for example, a random access memory (RAM). The memory 406 may also include a non-volatile memory, for example, a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD). The memory 406 stores executable program code, and the processor 404 executes the executable program code to implement the aforementioned cache management method 200. Specifically, the memory 406 stores the instructions used by the cache management apparatus 104 to execute the cache management method 200.
The communication interface 408 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 400 and other devices or a communication network.
An embodiment of the present application further provides a computing device cluster. As shown in FIG. 5, the computing device cluster includes at least one computing device 400. The computing devices included in the cluster may all be terminal devices, may all be cloud servers, or may be partly cloud servers and partly terminal devices.
In the above three deployment manners of the computing device cluster, the memory 406 in one or more computing devices 400 in the cluster may store the same instructions used by the cache management apparatus 104 to execute the cache management method 200.
In some possible implementations, one or more computing devices 400 in the computing device cluster may also be used to execute part of the instructions used by the cache management apparatus 104 to execute the cache management method 200. In other words, a combination of one or more computing devices 400 may jointly execute the instructions used by the cache management apparatus 104 to execute the cache management method 200.
It should be noted that the memories 406 in different computing devices 400 in the computing device cluster may store different instructions for executing part of the functions of the cache management apparatus 104.
FIG. 6 shows a possible implementation. As shown in FIG. 6, two computing devices 400A and 400B are connected through the communication interface 408. The memory in the computing device 400A stores instructions for executing the functions of the communication unit 302, the storage unit 304, the recording unit 308, the training unit 310, the decision unit 312, and the eviction unit 314. The memory in the computing device 400B stores instructions for executing the functions of the cache unit 306. In other words, the memories 406 of the computing devices 400A and 400B jointly store the instructions used by the cache management apparatus 104 to execute the cache management method 200.
The connection manner between the computing devices in the cluster shown in FIG. 6 may take into account that the cache management method 200 provided by the present application requires high-speed write or read operations on the data in the cache unit 306. Therefore, it is contemplated that the caching function be performed by the computing device 400B.
It should be understood that the functions of the computing device 400A shown in FIG. 6 may also be performed by multiple computing devices 400. Likewise, the functions of the computing device 400B may also be performed by multiple computing devices 400.
In some possible implementations, one or more computing devices in the computing device cluster may be connected through a network. The network may be a wide area network, a local area network, or the like. FIG. 7 shows a possible implementation. As shown in FIG. 7, two computing devices 400C and 400D are connected through a network. Specifically, each computing device is connected to the network through its communication interface. In this type of implementation, the memory 406 in the computing device 400C stores instructions for executing the communication unit 302, the storage unit 304, the recording unit 308, the decision unit 312, and the eviction unit 314. Meanwhile, the memory 406 in the computing device 400D stores instructions for executing the cache unit 306 and the training unit 310.
The connection manner between the computing devices in the cluster shown in FIG. 7 may take into account that the cache management method 200 provided by the present application requires high-speed write or read operations on the data in the cache unit 306 as well as a large amount of computation to train the prediction model. Therefore, it is contemplated that the functions implemented by the cache unit 306 and the training unit 310 be performed by the computing device 400D.
It should be understood that the functions of the computing device 400C shown in FIG. 7 may also be performed by multiple computing devices 400. Likewise, the functions of the computing device 400D may also be performed by multiple computing devices 400.
An embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium may be any usable medium that a computing device can store, or a data storage device, such as a data center, containing one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state drive), or the like. The computer-readable storage medium includes instructions that instruct the computing device to execute the cache management method 200 applied to the cache management apparatus 104.
An embodiment of the present application further provides a computer program product containing instructions. The computer program product may be software or a program product containing instructions that can run on a computing device or be stored in any usable medium. When the computer program product runs on at least one computing device, the at least one computing device is caused to execute the aforementioned cache management method 200.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present invention.
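As an illustrative sketch only, the hash-based sampling rule recited in the claims (a write request is treated as a first data write request if the hash of its data identifier is divisible by a sampling value, and as a second data write request otherwise) can be expressed as follows; the choice of MD5 as the hash function and the default sampling value are assumptions, since the claims do not fix either:

```python
import hashlib

def split_batch(data_ids, sampling_value=4):
    """Partition a batch of write requests by hash divisibility of the identifier.

    Requests whose identifier hashes to a multiple of the sampling value become
    first data write requests (used for training); the rest become second data
    write requests (routed through the prediction model). The hash function and
    sampling value here are illustrative choices.
    """
    first, second = [], []
    for data_id in data_ids:
        digest = hashlib.md5(data_id.encode()).hexdigest()
        if int(digest, 16) % sampling_value == 0:
            first.append(data_id)
        else:
            second.append(data_id)
    return first, second
```

Because the hash is deterministic, repeated requests for the same identifier always land in the same partition, which keeps the training sample stable across batches.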

Claims (19)

  1. A cache management method, wherein the cache management method is used in a cache management apparatus, and the method comprises:
    receiving a first data write request, where the first data write request is used to request that first data in a hard disk be written into a cache;
    training a cache write prediction model according to relevant parameters of the first data;
    writing the first data into the cache;
    receiving a second data write request, where the second data write request is used to request that second data in the hard disk be written into the cache;
    determining, according to the cache write prediction model, whether to write the second data into the cache.
  2. The method according to claim 1, wherein the first data write request and the second data write request belong to a first batch of data write requests, and the method further comprises:
    receiving the first batch of data write requests;
    determining, according to a sampling rule, the first data write request and the second data write request from the first batch of data write requests.
  3. The method according to claim 2, wherein determining the first data write request and the second data write request from the first batch of data write requests according to the sampling rule comprises:
    obtaining an identifier of the to-be-written data carried in each data write request of the first batch of data write requests;
    determining, as the first data write request, a data write request whose carried identifier of the to-be-written data has a hash value divisible by a sampling value;
    determining, as the second data write request, a data write request whose carried identifier of the to-be-written data has a hash value not divisible by the sampling value.
  4. The method according to any one of claims 1 to 3, further comprising:
    receiving a second batch of data write requests;
    determining, according to the trained cache write prediction model, whether to write at least one piece of the to-be-written data corresponding to the second batch of data write requests into the cache.
  5. The method according to any one of claims 1 to 4, wherein training the cache write prediction model according to the relevant parameters of the first data comprises:
    using at least one of an average size of data in the cache, a historical total number of write requests per cycle, a historical total number of read requests per cycle, and an average eviction cycle of data in the cache, together with the relevant parameters of the first data, as a training input of the cache write prediction model;
    determining, according to the request situation and eviction situation of the first data within one average eviction cycle after the first data request occurs, a write probability of the first data as a training output of the cache write prediction model;
    training the cache write prediction model according to the training input and the training output.
  6. The method according to any one of claims 1 to 5, wherein the relevant parameters of the first data comprise at least one of the following:
    a historical number of write requests for the first data, and a historical number of read requests for the first data.
  7. The method according to any one of claims 1 to 6, wherein determining whether to write the second data into the cache according to the cache write prediction model comprises:
    obtaining a write threshold;
    obtaining a write probability of the second data according to the cache write prediction model;
    determining, according to the write threshold and the write probability of the second data, whether to write the second data into the cache.
  8. The method according to claim 7, wherein obtaining the write probability of the second data according to the trained cache write prediction model comprises:
    using at least one of the average size of data in the cache, the average eviction cycle of data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, together with the relevant parameters of the second data, as a prediction input of the cache write prediction model;
    using the prediction input as an input of the trained cache write prediction model to obtain the write probability of the second data.
  9. A cache management apparatus, wherein the apparatus comprises a communication unit and a processing unit:
    the communication unit is configured to receive a first data write request, where the first data write request is used to request that first data in a hard disk be written into a cache;
    the processing unit is configured to train a cache write prediction model according to relevant parameters of the first data, and to write the first data into the cache;
    the communication unit is further configured to receive a second data write request, where the second data write request is used to request that second data in the hard disk be written into the cache;
    the processing unit is further configured to determine, according to the cache write prediction model, whether to write the second data into the cache.
  10. The apparatus according to claim 9, wherein the communication unit is configured to receive a first batch of data write requests, and the processing unit is configured to determine, according to a division rule, the first data write request and the second data write request from the first batch of data write requests.
  11. The apparatus according to claim 10, wherein the processing unit is configured to: obtain an identifier of the to-be-written data carried in each data write request of the first batch of data write requests; determine, as the first data write request, a data write request whose carried identifier of the to-be-written data has a hash value divisible by a sampling value; and determine, as the second data write request, a data write request whose carried identifier of the to-be-written data has a hash value not divisible by the sampling value.
  12. The apparatus according to any one of claims 9 to 11, wherein the processing unit is configured to receive a second batch of data write requests, and to determine, according to the trained cache write prediction model, whether to write at least one piece of the to-be-written data corresponding to the second batch of data write requests into the cache.
  13. The apparatus according to any one of claims 9 to 12, wherein the processing unit is configured to: use at least one of an average size of data in the cache, a historical total number of write requests per cycle, a historical total number of read requests per cycle, and an average eviction cycle of data in the cache, together with the relevant parameters of the first data, as a training input of the cache write prediction model; determine, according to the request situation and eviction situation of the first data within one average eviction cycle after the first data request occurs, a write probability of the first data as a training output of the cache write prediction model; and train the cache write prediction model according to the training input and the training output.
  14. The apparatus according to any one of claims 9 to 13, wherein the relevant parameters of the first data comprise at least one of the following: a historical number of write requests for the first data, and a historical number of read requests for the first data.
  15. The apparatus according to any one of claims 9 to 14, wherein the communication unit is configured to obtain a write threshold, and the processing unit is configured to obtain a write probability of the second data according to the cache write prediction model, and to determine, according to the write threshold and the write probability of the second data, whether to write the second data into the cache.
  16. The apparatus according to claim 15, wherein the processing unit is configured to: use at least one of the average size of data in the cache, the average eviction cycle of data in the cache, the historical total number of write requests per cycle, and the historical total number of read requests per cycle, together with the relevant parameters of the second data, as a prediction input of the cache write prediction model; and use the prediction input as an input of the trained cache write prediction model to obtain the write probability of the second data.
  17. A computing device cluster, comprising at least one computing device, each computing device comprising a processor and a memory;
    the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster executes the method according to any one of claims 1 to 8.
  18. A computer program product containing instructions, wherein when the instructions are run by a computer device cluster, the computer device cluster is caused to execute the method according to any one of claims 1 to 8.
  19. A computer-readable storage medium, comprising computer program instructions, wherein when the computer program instructions are executed by a computing device cluster, the computing device cluster executes the method according to any one of claims 1 to 8.
PCT/CN2021/139427 2021-01-21 2021-12-18 Cache management method and apparatus, and device WO2022156452A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110078600 2021-01-21
CN202110078600.X 2021-01-21
CN202110410912.6 2021-04-16
CN202110410912.6A CN114817319A (en) 2021-01-21 2021-04-16 Cache management method, device and equipment

Publications (1)

Publication Number Publication Date
WO2022156452A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105683930A (en) * 2013-10-29 2016-06-15 上海宝存信息科技有限公司 Dynamic caching method and system for data storage system
US20160239420A1 (en) * 2015-02-16 2016-08-18 Dell Products L.P. System and method for managing a cache pool
CN106708444A (en) * 2017-01-17 2017-05-24 北京联想核芯科技有限公司 Data storage method and hard disc controller
US20200133857A1 (en) * 2018-10-30 2020-04-30 EMC IP Holding Company LLC Increasing performance of write throughput using machine learning
CN111104066A (en) * 2019-12-17 2020-05-05 华中科技大学 Data writing method, data writing device, storage server and computer readable storage medium

Also Published As

Publication number Publication date
CN114817319A (en) 2022-07-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21920826; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21920826; Country of ref document: EP; Kind code of ref document: A1)