CN113297106A - Data replacement method based on hybrid storage, related method, device and system - Google Patents


Info

Publication number
CN113297106A
Authority
CN
China
Prior art keywords
data
storage
storage space
cache
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010294232.8A
Other languages
Chinese (zh)
Inventor
赵钊
张友东
夏德军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010294232.8A
Publication of CN113297106A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/12 Replacement control
    • G06F12/121 Replacement control using replacement algorithms

Abstract

The invention discloses a data replacement method based on hybrid storage, and a related method, device and system. The data replacement method comprises: checking whether the data size of the cache data in a first storage space exceeds a preset threshold; and if so, determining and deleting the data to be eliminated from the cache data according to the cooling degree and the occupied space of each data in the cache data. Compared with the prior art, the method provided by the embodiments of the invention improves the hit rate of the cache data in the storage space, expands the storage capacity, reduces the storage cost of data, improves the read and write speed and efficiency of the storage space, and makes better use of its high access performance.

Description

Data replacement method based on hybrid storage, related method, device and system
Technical Field
The present invention relates to the field of data processing, and in particular, to a data replacement method based on hybrid storage, and a related method, device and system.
Background
At present, although the internal memory of a computer system has high access performance, its storage capacity is small and its manufacturing and usage costs are high, so it cannot meet the storage requirement of a large amount of data. The prior art therefore generally adopts a hybrid storage mode: part of the frequently used data is cached in the internal memory, and the external memory stores the full data. In this way access performance is ensured, the storage capacity of the computer system is greatly expanded, and the storage cost of data is greatly reduced. When data is accessed, if the data is already in the internal memory it is obtained directly from the internal memory; if it is not, the data needs to be loaded from the external memory into the internal memory. To ensure the access performance of the internal memory, some data must be selected and eliminated in a timely manner. In the prior art, data is generally eliminated according to the cooling degree of the cache data in the internal memory. However, this often means that some data occupying a large space stays cached in the internal memory for a long time without being eliminated, so that internal memory usage remains high and some small data cannot be cached. As a result, the hit rate of data accesses in the internal memory is low, data must frequently be read from the external memory, and the speed and efficiency advantages of the internal memory cannot be fully exploited.
Disclosure of Invention
In view of the above, the present invention has been developed to provide a data replacement method based on hybrid storage, a related method and apparatus and system that overcome or at least partially solve the above-mentioned problems.
As a first aspect of the embodiments of the present invention, an embodiment of the present invention provides a data replacement method based on hybrid storage, including:
checking whether the data size of the cache data in the first storage space exceeds a preset threshold value;
if so, determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each data in the cache data.
In one or some possible embodiments, the cooling degree of the data is determined by at least one of the data idle time length, the frequency of data access and the number of data access, wherein the cooling degree of the data is positively correlated with the data idle time length and negatively correlated with the frequency of data access and the number of data access;
the determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each data in the cache data comprises the following steps:
determining the cooling degree and the occupied space of data in the cache data;
determining the elimination coefficient of the data in the cache data according to the cooling degree and the occupied space of the data;
and determining and deleting the data to be eliminated in descending order of the elimination coefficients of the data in the cache data.
In one or some possible embodiments, the elimination coefficient of the data is determined by one or a combination of the following:
determining the product of the cooling degree and the occupied space size of the data to obtain the elimination coefficient of the data;
or, taking logarithm of the size of the occupied space of the data, and determining the product of the logarithm of the size of the occupied space of the data and the cooling degree of the data to obtain the elimination coefficient of the data;
or weighting the cooling degree and the occupied space of the data according to preset weight to obtain the elimination coefficient of the data.
In one or some possible embodiments, the first storage space comprises a multi-level storage area; after determining the elimination coefficient of the data in the cached data according to the cooling degree and the occupied space of the data, the method further comprises the following steps:
moving each data in the cache data, according to its elimination coefficient and preset multi-level elimination coefficient threshold ranges ordered from large to small, to the storage area corresponding to the threshold range into which its elimination coefficient falls.
In one or some possible embodiments, when the size of the data amount of the cached data in the first storage space exceeds a preset threshold, determining the storage areas of the data to be eliminated according to the descending order of the elimination coefficient threshold range, and deleting the data in the corresponding storage areas.
In one or some possible embodiments, the first storage space is an internal memory.
As a second aspect of the embodiments of the present invention, an embodiment of the present invention provides a data storage method based on hybrid storage, including: writing and caching data in a first storage space;
writing the data cached in the first storage space into a second storage space;
and replacing the cache data in the first storage space by adopting the above data replacement method based on hybrid storage.
In one or some possible embodiments, the method further comprises: when the first storage space receives the written data, writing the data into the second storage space through asynchronous apply.
As a third aspect of the embodiments of the present invention, an embodiment of the present invention provides a data access method based on hybrid storage, including:
receiving a data access request sent by a data request end, and inquiring whether data corresponding to the request exists in cache data of a first storage space;
if not, reading the data to be accessed from the full data of the second storage space, writing the data into the cache data of the first storage space, and returning the data to the data request end;
and replacing the cached data in the first storage space by adopting the data replacement method based on the hybrid storage.
As a fourth aspect of the embodiments of the present invention, an embodiment of the present invention provides a data replacement apparatus based on hybrid storage, including:
the checking module is used for checking the data size of the cache data in the first storage space;
the judging module is used for judging whether the data size of the cache data exceeds a preset threshold value or not;
the determining module is used for determining to-be-eliminated data in the cache data according to the cooling degree and the occupied space of each data in the cache data when the judging module determines that the data size of the cache data exceeds a preset threshold;
and the deleting module is used for deleting the data to be eliminated.
As a fifth aspect of the embodiments of the present invention, an embodiment of the present invention provides a data hybrid storage system, including a data writing device and the above-mentioned data replacement device based on hybrid storage;
the data writing device is used for writing and caching data in a first storage space; and writing the cache data in the first storage space into the second storage space.
As a sixth aspect of the embodiments of the present invention, an embodiment of the present invention provides a hybrid storage data processing system, including: the data reading device and the data replacement device based on the hybrid storage;
the data reading device is used for receiving a data access request sent by a client and inquiring data to be accessed from cache data of the first storage space; when the cache data of the first storage space does not comprise the data to be accessed, acquiring the data to be accessed from the full data of the second storage space and writing the data to be accessed into the cache data of the first storage space; and returning the data to be accessed to the client.
As a seventh aspect of the embodiments of the present invention, an embodiment of the present invention provides a data hybrid storage and processing system, including a first storage space, a second storage space, and the above data replacement device based on hybrid storage;
the first storage space is used for storing cache data; the second storage space is used for storing full data.
As an eighth aspect of the embodiments of the present invention, an embodiment of the present invention provides an intelligent device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the instructions, when executed by the processor, implement the above data replacement method based on hybrid storage, the above data storage method based on hybrid storage, or the above data processing method based on hybrid storage.
As a ninth aspect of the embodiments of the present invention, the embodiments of the present invention provide a computer readable storage medium, on which computer instructions are stored, which when executed by a processor can implement the above-mentioned data replacement method based on hybrid storage, or which when executed by a processor can implement the above-mentioned data storage method based on hybrid storage, or which when executed by a processor can implement the above-mentioned data processing method based on hybrid storage.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
According to the data replacement method based on hybrid storage provided by the embodiment of the invention, the data to be eliminated is determined and deleted according to the cooling degree and the occupied space of the data cached in the storage space, so that data with low heat and large occupied space is selected for elimination. The data to be eliminated from the storage space is thus selected by considering both how cold or hot the data is and the volume of the data.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a data replacement method based on hybrid storage according to an embodiment of the present invention;
FIG. 2 is a flowchart of another data replacement method based on hybrid storage according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data storage method based on hybrid storage according to an embodiment of the present invention;
FIG. 4 is a flowchart of a data access method based on hybrid storage according to an embodiment of the present invention;
FIG. 5 is a block diagram of a hybrid storage based data replacement apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram of another hybrid storage-based data replacement apparatus according to an embodiment of the present invention;
FIG. 7 is a block diagram of a data hybrid storage system according to an embodiment of the present invention;
FIG. 8 is a block diagram of a hybrid-storage data processing system according to an embodiment of the present invention;
FIG. 9 is a block diagram of a data hybrid storage and processing system according to an embodiment of the present invention;
FIG. 10 is a block diagram of another data hybrid storage and processing system according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In order to solve the prior-art problem in hybrid storage that cache data replacement leads to a low data hit rate and degrades data access performance, an embodiment of the present invention provides a data replacement method based on hybrid storage. Referring to FIG. 1, the method performs at least the following steps:
S11, checking the data size of the cache data in the first storage space;
S12, judging whether the data size of the cache data exceeds a preset threshold; if so, executing S13, and if not, executing S11;
S13, determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each data in the cache data.
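Steps S11 to S13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Entry` type, the `replace_once` function, the 80% default ratio, and the use of the product of idle time and size as the ranking key are all hypothetical choices made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Entry:
    """One cached item; the field names are hypothetical."""
    size: int     # occupied space in bytes
    idle: float   # seconds since last access (stands in for the cooling degree)


def replace_once(cache: dict[str, Entry], capacity: int, ratio: float = 0.8) -> list[str]:
    """Run S11-S13 once and return the keys that were deleted."""
    threshold = capacity * ratio
    used = sum(e.size for e in cache.values())   # S11: check the data size
    if used <= threshold:                        # S12: threshold not exceeded
        return []
    # S13: rank by cooling degree and occupied space (assumed here: their
    # product), then delete from the largest score down until the cache fits.
    ranked = sorted(cache, key=lambda k: cache[k].idle * cache[k].size, reverse=True)
    evicted = []
    for key in ranked:
        if used <= threshold:
            break
        used -= cache.pop(key).size
        evicted.append(key)
    return evicted
```

In this sketch the caller is expected to invoke `replace_once` repeatedly, which corresponds to returning to S11 when the threshold is not exceeded.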
To better describe the data replacement method based on hybrid storage provided by the embodiment of the present invention, the application environment is first briefly described. The method may be applied to any scenario that requires hybrid storage (that is, one that uses two storage modes with different access efficiencies), for example a database-based hybrid storage system, and in particular a hybrid storage system based on a Redis-type database. Those skilled in the art can apply the technical solutions disclosed in the embodiments of the present invention to any suitable technical scenario.
In an embodiment, in steps S11 and S12 above, the first storage space may be an internal memory. Because the internal memory has a fast access speed but a small capacity, it is necessary to check whether the cache data in the storage space exceeds a preset threshold so that the cache data can be replaced in time. Specifically, for example, whether the data size of the cache data in the first storage space exceeds the preset threshold may be checked at a preset period. The preset threshold may be a proportion of the total capacity of the first storage space, for example 90% or 80% of the total capacity. The preset threshold may of course be set or adjusted according to the actual usage of the internal memory, so as to meet the requirements of both access speed and efficiency.
In step S13, the data may be a single key-value pair or a data block composed of a plurality of key-value pairs. The cache data in the first storage space is based on a key-value structure: the keys of all written data are cached, together with the data object values corresponding to at least some of those keys. To keep more useful data in the cache and increase the hit rate during data access, this step considers not only the cooling degree of the data but also the space it occupies. The greater the cooling degree of the data and the larger the space it occupies, the more preferentially it is removed from the cache data, so that the first storage space can cache more data.
In the embodiment of the present invention, the cooling degree of the data reflects how cold the data is. In the prior art, data is generally classified as hot or cold according to its access frequency and access count: for example, data with a high access frequency may be referred to as hot data, and data with a relatively low access frequency or a long idle time is referred to as cold data. The embodiment of the invention recognizes that some data has a large cooling degree but a small data volume and occupies little storage space, so that a large number of such data items together occupy only a small amount of storage space. Keeping such small data in the cache can increase the hit rate of data access in the cache data and improve access performance.
According to the data replacement method based on hybrid storage provided by the embodiment of the invention, the data to be eliminated is determined and deleted according to the cooling degree and the occupied space of the data cached in the storage space, so that data with low heat and large occupied space is selected for elimination. The data to be eliminated from the storage space is thus selected by considering both how cold or hot the data is and the volume of the data.
In the data replacement method based on hybrid storage of the embodiment of the present invention, when a data request end sends a data write request to the hybrid storage system and the first storage space receives the data to be written, the data can be written into the second storage space through asynchronous apply. For example, the data request end writes data into the first storage space; the key-value pair of the written data is recorded in the first storage space and an acknowledgement is fed back to the data request end, which perceives the write to the internal memory as complete and ends the data transmission. The hybrid storage system writes the data into an append-only file (AOF) log for persistence, and an asynchronous apply (Async apply) thread finally reads the data from the log and writes it into the second storage space. In this way, the full data written in the second storage space includes the cache data written in the first storage space, so the data to be eliminated can be deleted directly when the cache data is replaced.
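The write path described above might be sketched as follows. The `HybridWriter` class and its members are hypothetical; an in-memory queue stands in for the AOF log and a plain dictionary stands in for the second storage space, so this only illustrates the ordering of the steps, not real persistence.

```python
import queue
import threading


class HybridWriter:
    """Hypothetical write path: memory first, second storage space asynchronously."""

    def __init__(self):
        self.memory = {}            # first storage space (cache)
        self.disk = {}              # stand-in for the second storage space
        self.log = queue.Queue()    # stand-in for the AOF log
        self._worker = threading.Thread(target=self._apply, daemon=True)
        self._worker.start()

    def write(self, key, value):
        self.memory[key] = value    # the requester sees the write complete here
        self.log.put((key, value))  # logged for later application

    def _apply(self):
        # Asynchronous apply thread: drain the log into the second storage space.
        while True:
            key, value = self.log.get()
            self.disk[key] = value
            self.log.task_done()

    def flush(self):
        """Block until every logged write has been applied."""
        self.log.join()
```

Because every write eventually reaches `disk`, an entry can be dropped from `memory` during replacement without first being written back, provided its logged write has already been applied.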
In the data replacement method based on hybrid storage, when a data request end requests data access from the hybrid storage system, if the cache data contains the data to be accessed, that data can be returned directly to the data request end. When the cache data does not include the data to be accessed, the data is acquired from the full data of the second storage space, loaded into the first storage space, and returned to the data request end. Because the access frequency and speed of data in the second storage space are normally much lower than those in the first storage space, when the cache data does not include the data to be accessed the request cannot be served directly from the cache, and the access performance of the hybrid storage system is affected.
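The read path can be illustrated with a short sketch. The function name `read` and the use of two dictionaries for the first and second storage spaces are assumptions for the example; the sketch also simplifies by not keeping all keys resident in memory.

```python
def read(key, memory: dict, disk: dict):
    """Return the value for key, loading it into the cache on a miss."""
    if key in memory:        # hit: served from the first storage space
        return memory[key]
    value = disk[key]        # miss: read from the full data (KeyError if absent)
    memory[key] = value      # load into the cache for later requests
    return value
```

After a miss has loaded new data, the cache may exceed its threshold, which is when the replacement method would be applied.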
In one or some alternative embodiments, the cooling degree of the data is determined by at least one of the data idle time length, the frequency of data access and the number of data access, wherein the cooling degree of the data is positively correlated with the data idle time length and negatively correlated with the frequency of data access and the number of data access.
In the step S13, determining and deleting to-be-eliminated data in the cache data according to the cooling degree and the occupied space of each data in the cache data, includes:
determining the cooling degree and the occupied space of data in the cache data;
determining the elimination coefficient of the data in the cache data according to the cooling degree and the occupied space of the data;
and determining and deleting the data to be eliminated in descending order of the elimination coefficients of the data in the cache data.
In one embodiment, when the cooling degree of the data is characterized by the data idle time, the idle time idletime is obtained by subtracting the time at which the data was last accessed from the current time current_time. The frequency of data access may be determined by using the Least Recently Used (LRU) page replacement algorithm of the prior art, and the number of data accesses may be determined by using the Least Frequently Used (LFU) page replacement algorithm of the prior art.
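The bookkeeping needed for the idle time and the access count could look like the sketch below. The `AccessStats` class is hypothetical, and the optional `now` parameter exists only so that the example can be exercised with fixed timestamps.

```python
import time


class AccessStats:
    """Hypothetical per-key bookkeeping for cooling-degree inputs."""

    def __init__(self, now=None):
        self.last_access_time = time.time() if now is None else now
        self.access_count = 0

    def touch(self, now=None):
        """Record one access."""
        self.last_access_time = time.time() if now is None else now
        self.access_count += 1

    def idle_time(self, now=None):
        """Current time minus the time of the last access."""
        current_time = time.time() if now is None else now
        return current_time - self.last_access_time
```

A longer idle time corresponds to a greater cooling degree, while a higher access count corresponds to a smaller one.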
After the cooling degree of the data is obtained in any one of the three ways above, the elimination coefficient of the data is determined in combination with the occupied space of each data in the cache data. For example, the product of the cooling degree and the occupied space of the data is determined to obtain the elimination coefficient of the data; or the logarithm of the occupied space of the data is taken, and the product of that logarithm and the cooling degree of the data is determined to obtain the elimination coefficient; or the cooling degree and the occupied space of the data are weighted according to preset weights to obtain the elimination coefficient. Of course, once the cooling degree and the occupied space of the data are determined, other prior-art mathematical methods that combine the two factors may also be used to determine the elimination coefficient. The embodiment of the present invention does not specifically limit this, provided that the elimination coefficient can be determined so as to screen out the data to be eliminated.
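The three example formulas can be written out as small functions. The function names, the choice of the natural logarithm, and the default weights of 0.5 are assumptions; the text does not fix a logarithm base or weight values.

```python
import math


def coeff_product(cooling: float, size: float) -> float:
    """Cooling degree multiplied by occupied space."""
    return cooling * size


def coeff_log(cooling: float, size: float) -> float:
    """Cooling degree multiplied by the logarithm of occupied space.

    The logarithm damps the influence of very large values; size must be
    positive, and a size of 1 yields a coefficient of zero.
    """
    return cooling * math.log(size)


def coeff_weighted(cooling: float, size: float,
                   w_cooling: float = 0.5, w_size: float = 0.5) -> float:
    """Weighted sum of cooling degree and occupied space."""
    return w_cooling * cooling + w_size * size
```

With the weighted form, the two inputs usually need to be brought to comparable scales first, since a size in bytes can be many orders of magnitude larger than an idle time in seconds.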
The data to be eliminated is determined and deleted according to the elimination coefficient obtained in this way. The greater the cooling degree of the data, the larger its elimination coefficient; and the larger the space the data occupies, the larger its elimination coefficient. Data with a greater cooling degree can therefore be eliminated preferentially, which prevents cold data from occupying the memory. At the same time, data with a larger volume is eliminated preferentially, which prevents a small number of data items from consuming so much internal memory that other data cannot be loaded.
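A small comparison illustrates the effect. The two items and their numbers are invented for the example, and the product form of the elimination coefficient is assumed.

```python
# Hypothetical cache contents: idle time in seconds, size in bytes.
items = {
    "big_hash": {"idle": 30, "size": 50_000_000},
    "small_str": {"idle": 60, "size": 1_000},
}

# Ordering by cooling degree alone picks the small string first.
by_idle = sorted(items, key=lambda k: items[k]["idle"], reverse=True)

# Ordering by idle * size picks the large hash first.
by_coeff = sorted(
    items, key=lambda k: items[k]["idle"] * items[k]["size"], reverse=True
)
```

Evicting `big_hash` frees far more memory per evicted item, leaving room for many small entries such as `small_str`.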
In one or some alternative embodiments, to make it more convenient to replace cache data out of the first storage space, the first storage space may include multi-level storage areas. On this basis, in the embodiment of the present invention, after the elimination coefficient of the data in the cache data is determined according to the idle time and the occupied space of the data, the following step may further be performed:
moving each data in the cache data, according to its elimination coefficient and preset multi-level elimination coefficient threshold ranges ordered from large to small, to the storage area corresponding to the threshold range into which its elimination coefficient falls.
When the cache data in the first storage space exceeds the preset threshold, the storage area holding the data to be eliminated is determined in order of elimination coefficient threshold range from high to low, and the data in the corresponding storage area is deleted.
Specifically, the first storage space may be divided into a plurality of storage areas of different levels; data with a small elimination coefficient may be stored in a high-level storage area, and data with a large elimination coefficient in a low-level storage area. For example, the first storage space is divided into high-, medium- and low-level storage areas, and data with different elimination coefficients is cached in the storage areas of the corresponding levels. When the check finds that the data size of the cache data in the first storage space exceeds the preset threshold, the data in the low-level storage area is replaced first; if the data size still exceeds the preset threshold, the data in the medium-level storage area is replaced, and the data in the high-level storage area is replaced last.
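The three-level example might be sketched as follows. The function names, the boundary values, and the choice to clear a whole area before re-checking the threshold are assumptions; the text does not specify how much of an area is deleted at a time.

```python
def assign_level(coeff: float, bounds=(100.0, 10_000.0)) -> str:
    """Map an elimination coefficient to a storage-area level (bounds are made up)."""
    low_bound, high_bound = bounds
    if coeff >= high_bound:
        return "low"       # largest coefficients: evicted first
    if coeff >= low_bound:
        return "medium"
    return "high"          # smallest coefficients: evicted last


def evict_by_level(areas: dict, sizes: dict, used: int, threshold: int) -> int:
    """Clear areas from low to high until used <= threshold; return the new usage."""
    for level in ("low", "medium", "high"):
        if used <= threshold:
            break
        for key in list(areas[level]):
            used -= sizes.pop(key)
            areas[level].discard(key)
    return used
```

Grouping data by level in advance means that the eviction step does not need to sort individual entries when the threshold is exceeded.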
In a specific implementation, the step of moving the data in the cache data to the storage areas corresponding to the multi-level elimination coefficient threshold ranges may be performed periodically. This process may run independently of steps S11 to S13, and no strict ordering between them is required.
In a specific embodiment, the first storage space may be an internal memory, and the second storage space may be an external memory. The internal memory is a main memory of the computer system, and the external memory is an auxiliary memory of the computer system, such as a solid state disk or a magnetic disk. In the embodiment of the present invention, the full amount of data in the second storage space includes keys of all written data and corresponding data object values.
In order to better describe the above cache data replacement method provided by the present invention, an embodiment of the present invention is described with reference to a specific embodiment.
The data replacement method based on the hybrid storage provided by the embodiment of the invention can be applied to a storage system with a complex data structure so as to realize reasonable data replacement of cache data, improve the hit rate of the cache data and improve the access performance of the storage system.
The following description takes a storage system applied to Redis as an example:
Because the Redis database supports multiple data types such as strings, hash tables, lists, sets, sorted sets, bitmaps and HyperLogLogs, the data size and complexity of the data of each type may vary greatly. For example, the data object value in different key-value pairs may be a simple string, a long linked list, or a large hash table. In the prior art, cache data replacement is generally performed through algorithms such as LRU (least recently used) or LFU (least frequently used), as implemented for the cache lines of a central processing unit (CPU) and the page cache of an operating system. These methods consider only the access time or frequency of the data: the greater the cooling degree of the data, the more preferentially it is eliminated. The occupied space of the data is not taken into account during replacement. Consequently, data that has a relatively small cooling degree but occupies a very large amount of memory takes up a large share of the internal memory, and enough storage space can be released only by eliminating a large number of data items that each occupy little space, so suitable data to eliminate cannot be selected well. At the same time, eliminating a large number of small data items reduces the number of data items in the cache, so the hit rate of the cache data is low and the access performance of the internal memory cannot be fully exploited.
In addition, the prior art also provides storage systems compatible with the open-source Redis database, such as Pika and Ardb. Such a storage system reduces storage cost by storing the full data in external storage and accessing that data directly through a request processing layer that is partially compatible with the Redis protocol. The disadvantages of this approach are that the cooling degree of the data is not considered and that directly accessing the external storage takes a long time, which results in poor data access performance.
According to the data replacement method based on hybrid storage provided by the embodiment of the invention, when the data volume of the cache data exceeds a preset threshold, the cooling degree and the occupied space of the data in the cache data are determined; an elimination coefficient of the data is determined according to the idle time and the occupied space of the data; and the data to be eliminated is determined and deleted according to the elimination coefficient of each data in the cache data. The main difference from the prior art lies in how the data to be eliminated is determined when the cache data is replaced, namely in the determination of the elimination coefficient. Referring to FIG. 2, when executing a cache data elimination task, a check task may be executed periodically at a preset interval to check the data size of the cache data in the first storage space. When the data volume of the cache data does not exceed the preset threshold, the cache data elimination task ends and is executed again in the next cycle; when the data volume exceeds the preset threshold, the elimination coefficient of the data in the cache data is determined, and cache data in the first storage space is eliminated according to the determined elimination coefficient. When determining the elimination coefficient of the data in the cache data, the cooling degree of the data can be characterized by the idle time of the data. The specific steps for executing the data replacement method based on hybrid storage are as follows:
S10: periodically executing a check task;
S11: checking the data amount of the cache data in the first storage space;
S12: judging whether the data amount of the cache data exceeds a preset threshold; if so, executing S13; if not, executing S14;
S13: determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each piece of data in the cache data;
S14: ending.
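The control flow of steps S10 to S14 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `check_once` and `run_periodically`, the use of a Python dict as the first storage space, and the pluggable `evict` callback are all assumptions made for the example.

```python
import time
from typing import Callable, Dict

Cache = Dict[str, bytes]  # hypothetical stand-in for the first storage space


def check_once(cache: Cache, threshold: int,
               evict: Callable[[Cache, int], None]) -> bool:
    """Run S11-S13 once; return True if elimination was triggered."""
    data_amount = sum(len(v) for v in cache.values())  # S11: measure cache size
    if data_amount <= threshold:                       # S12: threshold not exceeded
        return False                                   # S14: end this cycle
    evict(cache, threshold)                            # S13: eliminate data
    return True


def run_periodically(cache: Cache, threshold: int,
                     evict: Callable[[Cache, int], None],
                     period_s: float, cycles: int,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """S10: repeat the check every `period_s` seconds for `cycles` rounds."""
    triggered = 0
    for _ in range(cycles):
        if check_once(cache, threshold, evict):
            triggered += 1
        sleep(period_s)  # wait for the next cycle
    return triggered
```

The `sleep` parameter is injectable only so that the loop can be exercised without real delays; a production task would more likely run on a timer or scheduler.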
In step S13, the cooling degree of the data is represented by the idle duration of the data, idle = current_time - last_access_time, that is, the current time minus the time at which the data was last accessed. The occupied space of the data is denoted use, where use is the sum of the sizes of all elements in the value of one data object. The elimination coefficient of the data, swap availability, is obtained as the product of the idle duration and the occupied space, which can be expressed as: swap availability = idle × use.
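As a rough sketch of this calculation, the following Python code computes the idle duration, the occupied space, and their product for each cached entry, then deletes entries in descending order of the coefficient. The `Entry` class, the function names, and the rule of stopping once the total size falls to the threshold are assumptions for illustration; the text above only fixes the formula itself.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    value: List[bytes]       # elements of one data object value
    last_access_time: float  # time of last access, in seconds


def occupied_space(entry: Entry) -> int:
    # use: sum of the sizes of all elements in the value
    return sum(len(element) for element in entry.value)


def elimination_coefficient(entry: Entry, current_time: float) -> float:
    idle = current_time - entry.last_access_time
    return idle * occupied_space(entry)  # swap availability = idle * use


def evict_until_under(cache: Dict[str, Entry], threshold: int,
                      current_time: float) -> List[str]:
    """Delete entries, largest coefficient first, until total size <= threshold."""
    total = sum(occupied_space(e) for e in cache.values())
    ranked = sorted(cache,
                    key=lambda k: elimination_coefficient(cache[k], current_time),
                    reverse=True)
    evicted = []
    for key in ranked:
        if total <= threshold:  # assumed stopping rule
            break
        total -= occupied_space(cache[key])
        del cache[key]
        evicted.append(key)
    return evicted
```

With this product form, a small entry that has been idle for a long time and a large entry that was idle only briefly can end up with comparable coefficients, which is the balance between coldness and size that the method aims for.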
In this way, both the cooling degree and the size of the data are considered when determining the data to be eliminated. These two factors, which both affect the access performance of the cache data, are weighed together to select the most suitable data for elimination, striking a balance between how cold the data is and how much space it occupies. As a result, the data cached in the internal memory satisfies most data accesses, the in-memory hit rate during data access is improved, and the performance of a Redis-type database in a hybrid storage system comprising internal and external memory is further improved, without additional storage cost.
Based on the same inventive concept, an embodiment of the present invention further provides a data storage method based on hybrid storage, and as shown in fig. 3, the method includes:
S21: writing and caching data in the first storage space, and writing the data cached in the first storage space into a second storage space;
and replacing the cached data in the first storage space by adopting the following steps:
S22: checking the data amount of the cache data in the first storage space;
S23: judging whether the data amount of the cache data exceeds a preset threshold; if so, executing S24; if not, executing S22;
S24: determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each piece of data in the cache data.
In the data storage method based on hybrid storage provided by the embodiment of the invention, when the first storage space receives written data, the written data is written into the second storage space by asynchronous apply.
In the data storage method based on hybrid storage provided in the embodiment of the present invention, the specific process of replacing the cached data in the first storage space is similar to steps S11-S13 of the data replacement method based on hybrid storage, and reference may be made to the description of that method in the foregoing embodiment. The specific manner of writing data into the second storage space by asynchronous apply may be any manner in the prior art and is not particularly limited in the embodiment of the present invention.
Based on the same inventive concept, an embodiment of the present invention further provides a data access method based on hybrid storage, and as shown in fig. 4, the method includes:
S31: receiving a data access request sent by a data request end;
S32: querying whether the data corresponding to the request exists in the cache data of the first storage space; if so, executing step S34; if not, executing step S33;
S33: reading the data to be accessed from the full data of the second storage space and writing the data into the cache data of the first storage space;
S34: returning the data to the data request end;
and replacing the cached data in the first storage space by adopting the following steps:
S35: checking the data amount of the cache data in the first storage space;
S36: judging whether the data amount of the cache data exceeds a preset threshold; if so, executing S37; if not, executing S35;
S37: determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each piece of data in the cache data.
In the data access method based on hybrid storage provided in the embodiment of the present invention, the specific process of replacing the cached data in the first storage space is similar to steps S11-S13 in the data replacement method based on hybrid storage, and the specific process may refer to the manner described in the data replacement method based on hybrid storage in the foregoing embodiment. In the embodiment of the present invention, any manner in the prior art may be used as a specific manner of reading data to be accessed from the first storage space and/or the second storage space according to the data access request, and a specific implementation manner is not particularly limited in the embodiment of the present invention.
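The read path of steps S31 to S34 can be illustrated with a short sketch. The class name `HybridStore`, the dict-based storage spaces, and the `misses` counter are assumptions for the example; as noted above, the actual read mechanism is not limited by the embodiment.

```python
from typing import Dict, Optional


class HybridStore:
    """Read path only: cache lookup first, then the full data set."""

    def __init__(self, full_data: Dict[str, bytes]):
        self.cache: Dict[str, bytes] = {}  # first storage space (cache data)
        self.full_data = full_data         # second storage space (full data)
        self.misses = 0                    # illustrative counter, not in the source

    def get(self, key: str) -> Optional[bytes]:
        if key in self.cache:              # S32: data found in the cache
            return self.cache[key]         # S34: return to the request end
        self.misses += 1
        value = self.full_data.get(key)    # S33: read from the full data
        if value is not None:
            self.cache[key] = value        # S33: write into the cache data
        return value                       # S34: return to the request end
```

Because every miss adds data to the first storage space, the replacement steps S35 to S37 are what keep the cache from growing past its threshold.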
Based on the same inventive concept, embodiments of the present invention further provide a data replacement device based on hybrid storage, a data hybrid storage system, a hybrid storage data processing system, a data hybrid storage and processing system, and an intelligent device. Because the principles by which these devices and systems solve the problem are similar to those of the foregoing cache data replacement method, data hybrid storage method, and hybrid storage data processing method, their implementation may refer to the implementation of the foregoing methods, and repeated parts are not described again.
An embodiment of the present invention provides a data replacement device 1 based on hybrid storage, as shown in fig. 5, including:
the checking module 101 is configured to check a data size of the cache data in the first storage space;
the judging module 102 is configured to judge whether a data size of the cache data exceeds a preset threshold;
the determining module 103 is configured to determine to-be-eliminated data in the cache data according to the cooling degree and the occupied space of each data in the cache data when the determining module 102 determines that the data size of the cache data exceeds the preset threshold;
and the deleting module 104 is configured to delete the data to be eliminated.
In one or some optional embodiments, the cooling degree of the data is determined by at least one of the data idle time length, the data access frequency and the data access times, wherein the cooling degree of the data is positively correlated with the data idle time length and negatively correlated with the data access frequency and the data access times.
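The text specifies only the direction of each correlation, not a formula. Purely as an illustration of one function with the stated monotonic behavior, the sketch below divides the idle duration by terms that grow with access frequency and access count; the function name, parameters, and the particular formula are hypothetical and are not taken from the patent.

```python
def cooling_degree(idle_seconds: float,
                   access_frequency: float = 0.0,
                   access_count: int = 0) -> float:
    """Hypothetical cooling degree: grows with idle time, shrinks with
    access frequency and access count (formula assumed for illustration)."""
    return idle_seconds / ((1.0 + access_frequency) * (1.0 + access_count))
```

Any other function that increases with idle duration and decreases with access frequency and access count would satisfy the same description.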
In one or some optional embodiments, the determining module 103 is specifically configured to determine a cooling degree and an occupied space size of data in the cache data;
determining the elimination coefficient of the data in the cache data according to the cooling degree and the occupied space of the data;
and determining the data to be eliminated according to the sequence of eliminating coefficients of all data in the cache data from large to small.
In one or some alternative embodiments, the determining module 103 is configured to determine the elimination factor of the data by one or a combination of the following methods:
determining the product of the cooling degree and the occupied space size of the data to obtain the elimination coefficient of the data;
or, taking logarithm of the size of the occupied space of the data, and determining the product of the logarithm of the size of the occupied space of the data and the cooling degree of the data to obtain the elimination coefficient of the data;
or weighting the cooling degree and the occupied space of the data according to preset weight to obtain the elimination coefficient of the data.
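The three ways of obtaining the elimination coefficient can be compared in a small sketch. The function names are illustrative; the natural logarithm and the weighted-sum form with default weights of 0.5 are assumptions, since the text does not fix the logarithm base or how the preset weights are applied.

```python
import math


def product_coefficient(cooling: float, size: float) -> float:
    # Product of cooling degree and occupied space
    return cooling * size


def log_product_coefficient(cooling: float, size: float) -> float:
    # Product of cooling degree and log of occupied space (size >= 1 assumed)
    return cooling * math.log(size)


def weighted_coefficient(cooling: float, size: float,
                         w_cooling: float = 0.5, w_size: float = 0.5) -> float:
    # Weighted combination with preset weights (weighted sum assumed)
    return w_cooling * cooling + w_size * size
```

Taking the logarithm damps the influence of very large objects: an entry with cooling degree 10 and size 1000 outranks one with cooling degree 100 and size 10 under the plain product, but the order reverses under the logarithmic variant.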
In one or some alternative embodiments, the first storage space comprises a multi-level storage area; after determining the elimination factor of the data in the cache data according to the cooling degree and the occupied space of the data, referring to fig. 6, the cache data replacement apparatus further includes:
and the adjusting storage module 105 is configured to move each piece of data in the cache data, according to its elimination coefficient and the preset multi-level elimination coefficient threshold ranges ordered from large to small, into the storage area corresponding to the threshold range within which its elimination coefficient falls.
In one or some optional embodiments, the deleting module 104 is specifically configured to, when the cached data in the first storage space exceeds a preset threshold, determine, according to an order from a high elimination coefficient threshold range to a low elimination coefficient threshold range, a storage area in which the data is to be eliminated, and delete the data in the corresponding storage area.
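A possible reading of the multi-level arrangement is sketched below: each entry is placed in the storage area whose coefficient range contains its elimination coefficient, and elimination proceeds from the area with the highest range downward. The bounds in `LEVEL_LOWER_BOUNDS`, the function names, and the choice to delete entry by entry until the total size reaches the threshold are assumptions for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical lower bounds of each level's coefficient range, largest first.
LEVEL_LOWER_BOUNDS = (1000.0, 100.0, 0.0)


def level_for(coefficient: float,
              bounds: Sequence[float] = LEVEL_LOWER_BOUNDS) -> int:
    """Return the index of the first range whose lower bound is met."""
    for level, lower in enumerate(bounds):
        if coefficient >= lower:
            return level
    return len(bounds) - 1  # anything below all bounds goes to the last level


def assign_levels(coefficients: Dict[str, float],
                  bounds: Sequence[float] = LEVEL_LOWER_BOUNDS) -> List[List[str]]:
    """Group keys into storage areas, one area per coefficient range."""
    areas: List[List[str]] = [[] for _ in bounds]
    for key, coefficient in coefficients.items():
        areas[level_for(coefficient, bounds)].append(key)
    return areas


def evict_by_level(areas: List[List[str]], sizes: Dict[str, int],
                   total: int, threshold: int) -> List[str]:
    """Delete from the highest-coefficient area first until total <= threshold."""
    evicted = []
    for area in areas:
        while area and total > threshold:
            key = area.pop()
            total -= sizes[key]
            evicted.append(key)
        if total <= threshold:
            break
    return evicted
```

Grouping by range means elimination does not need a full sort of all cached entries each time the threshold is exceeded; only the area for the highest range has to be visited first.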
In one or some alternative embodiments, the first storage space described in the cache data replacement apparatus is an internal memory.
The embodiment of the invention also provides a data hybrid storage system, which is shown in fig. 7 and comprises a data writing device 2 and the data replacement device 1 based on hybrid storage;
the data writing device 2 is used for writing and buffering data in a first storage space; and writing the cache data in the first storage space into the second storage space.
An embodiment of the present invention further provides a hybrid storage data processing system, and as shown in fig. 8, the system includes: the data reading device 3 and the data replacement device 1 based on the hybrid storage;
the data reading device 3 is configured to receive a data access request sent by a client, and query data to be accessed from cache data in the first storage space; when the cache data of the first storage space does not comprise the data to be accessed, acquiring the data to be accessed from the full data of the second storage space and writing the data to be accessed into the cache data of the first storage space; and returning the data to be accessed to the client.
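An embodiment of the present invention further provides a data hybrid storage and processing system. Referring to fig. 9, the system includes a first storage space 4, a second storage space 5, and the hybrid storage-based data replacement device 1;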
the first storage space 4 is used for storing cache data;
the second storage space 5 is used for storing the full amount of data.
In one or some alternative embodiments, referring to fig. 10, the data hybrid storage and processing system further includes: and a persistence means 6 for writing the data written to the first storage space 4 to the log by using the AOF method.
In a specific embodiment, the data hybrid storage and processing system further includes an upper-layer protocol parsing layer 7. The protocol parsing layer 7 sends a write data request to the first storage space 4 and writes the data into the first storage space 4; the first storage space 4 records the key-value pair of the written data and returns the key-value pair to the protocol parsing layer 7, which ends the data transmission. Meanwhile, the persistence device 6 writes the data written into the first storage space 4 to a log in the AOF (append-only file) persistence manner, and an asynchronous apply (Async apply) thread then reads the log and writes the data into the second storage space 5. The full data in the second storage space 5 therefore contains the cache data written into the first storage space 4, so when the cache data is replaced, the data to be eliminated can be deleted directly from the first storage space 4.
When the protocol parsing layer 7 sends a data access request to the first storage space 4 to look up the data to be accessed, if the cache data of the first storage space 4 contains the data to be accessed, the data can be returned directly to the protocol parsing layer 7. When the cache data does not contain the data to be accessed, the data is acquired from the full data of the second storage space 5 by means of the data replacement device 1 based on hybrid storage and swapped into the first storage space 4, and the first storage space 4 then returns the data to the protocol parsing layer 7.
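The write path described here (write to the cache, append to a log, apply the log asynchronously to the second storage space) can be sketched as follows. This is an assumption-laden illustration: an in-memory `queue.Queue` stands in for the AOF log, a single background thread stands in for the Async apply thread, and the names `WritePath`, `flush`, and `close` are invented for the example.

```python
import queue
import threading
from typing import Dict


class WritePath:
    def __init__(self) -> None:
        self.cache: Dict[str, bytes] = {}      # first storage space
        self.full_data: Dict[str, bytes] = {}  # second storage space
        self.log: queue.Queue = queue.Queue()  # stand-in for the AOF log
        self._worker = threading.Thread(target=self._apply_loop, daemon=True)
        self._worker.start()

    def write(self, key: str, value: bytes) -> None:
        self.cache[key] = value     # write and cache in the first storage space
        self.log.put((key, value))  # append the write to the log
        # The caller gets control back here, before the second storage is updated.

    def _apply_loop(self) -> None:
        # Background apply: replay log records into the second storage space.
        while True:
            record = self.log.get()
            if record is None:      # shutdown marker
                self.log.task_done()
                break
            key, value = record
            self.full_data[key] = value
            self.log.task_done()

    def flush(self) -> None:
        self.log.join()             # block until all logged writes are applied

    def close(self) -> None:
        self.log.put(None)
        self._worker.join()
```

Once a record has been applied, the corresponding cache entry can be deleted without a separate write-back step, because the second storage space already holds the value.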
An embodiment of the present invention further provides an intelligent device, including: a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the instructions, when executed by the processor, can implement the above-mentioned data replacement method based on hybrid storage, or the above-mentioned data storage method based on hybrid storage, or the above-mentioned data processing method based on hybrid storage.
Embodiments of the present invention also provide a computer-readable storage medium having computer instructions stored thereon, wherein the instructions, when executed by a processor, can implement the above-mentioned data replacement method based on hybrid storage, or the above-mentioned data storage method based on hybrid storage, or the above-mentioned data processing method based on hybrid storage.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (15)

1. A data replacement method based on hybrid storage comprises the following steps:
checking whether the data size of the cache data in the first storage space exceeds a preset threshold value;
if so, determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each data in the cache data.
2. The method of claim 1, wherein the degree of cooling of the data is determined by at least one of a data idle period, a frequency of data access, and a number of data accesses, wherein the degree of cooling of the data is positively correlated with the data idle period and negatively correlated with the frequency of data access and the number of data accesses;
the determining and deleting the data to be eliminated in the cache data according to the cooling degree and the occupied space of each data in the cache data comprises the following steps:
determining the cooling degree and the occupied space of data in the cache data;
determining the elimination coefficient of the data in the cache data according to the cooling degree and the occupied space of the data;
and determining and deleting the data to be eliminated according to the sequence of eliminating coefficients of the data in the cache data from large to small.
3. The method of claim 2, wherein the elimination coefficient of the data is determined by one or a combination of the following:
determining the product of the cooling degree and the occupied space size of the data to obtain the elimination coefficient of the data;
or, taking logarithm of the size of the occupied space of the data, and determining the product of the logarithm of the size of the occupied space of the data and the cooling degree of the data to obtain the elimination coefficient of the data;
or weighting the cooling degree and the occupied space of the data according to preset weight to obtain the elimination coefficient of the data.
4. The method of claim 2, the first storage space comprising a multi-level storage area; after determining the elimination coefficient of the data in the cached data according to the cooling degree and the occupied space of the data, the method further comprises the following steps:
and respectively adjusting the data in the cache data to the storage areas corresponding to the preset multi-level elimination coefficient threshold range for storage according to the elimination coefficient of the cache data and the preset multi-level elimination coefficient threshold range from large to small.
5. The method of claim 4, when the size of the data amount of the cache data in the first storage space exceeds a preset threshold, determining the storage areas of the data to be eliminated according to the descending order of the elimination coefficient threshold range, and deleting the data in the corresponding storage areas.
6. The method according to any of claims 1-5, wherein the first storage space is an internal memory.
7. A data storage method based on hybrid storage comprises the following steps: writing and caching data in a first storage space; writing the data cached in the first storage space into a second storage space;
the data cached in the first storage space is replaced by using the data replacement method based on hybrid storage according to any one of claims 1 to 6.
8. The method of claim 7, further comprising: and when the first storage space receives the written data, writing the data into the second storage space through the asynchronous application.
9. A data access method based on hybrid storage, comprising: receiving a data access request sent by a data request end, and inquiring whether data corresponding to the request exists in cache data of a first storage space; if not, reading the data to be accessed from the full data of the second storage space, writing the data into the cache data of the first storage space, and returning the data to the data request end;
the data cached in the first storage space is replaced by using the data replacement method based on hybrid storage according to any one of claims 1 to 6.
10. A hybrid storage based data replacement apparatus, comprising:
the checking module is used for checking the data size of the cache data in the first storage space;
the judging module is used for judging whether the data size of the cache data exceeds a preset threshold value or not;
the determining module is used for determining to-be-eliminated data in the cache data according to the cooling degree and the occupied space of each data in the cache data when the judging module determines that the data size of the cache data exceeds a preset threshold;
and the deleting module is used for deleting the data to be eliminated.
11. A data hybrid storage system comprising a data writing means and the hybrid storage based data replacement means of claim 10;
the data writing device is used for writing and caching data in a first storage space; and writing the cache data in the first storage space into the second storage space.
12. A hybrid-storage data processing system, comprising: a data reading device and the hybrid storage based data replacement device of claim 10;
the data reading device is used for receiving a data access request sent by a client and inquiring data to be accessed from cache data of the first storage space; when the cache data of the first storage space does not comprise the data to be accessed, acquiring the data to be accessed from the full data of the second storage space and writing the data to be accessed into the cache data of the first storage space; and returning the data to be accessed to the client.
13. A data hybrid storage and processing system comprising a first storage space, a second storage space, and the hybrid storage based data replacement device of claim 10;
the first storage space is used for storing cache data; the second storage space is used for storing full data.
14. A smart device, comprising: a memory, a processor, and computer instructions stored on the memory and executable on the processor, the instructions being capable of implementing the hybrid storage based data replacement method according to any one of claims 1 to 6 when executed by the processor, or the instructions being capable of implementing the hybrid storage based data storage method according to claim 7 or 8 when executed by the processor, or the instructions being capable of implementing the hybrid storage based data processing method according to claim 9 when executed by the processor.
15. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, are capable of implementing the hybrid storage based data replacement method according to any one of claims 1-6, or which, when executed by a processor, are capable of implementing the hybrid storage based data storage method according to claim 7 or 8, or which, when executed by a processor, are capable of implementing the hybrid storage based data processing method according to claim 9.
CN202010294232.8A 2020-04-15 2020-04-15 Data replacement method based on hybrid storage, related method, device and system Pending CN113297106A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010294232.8A CN113297106A (en) 2020-04-15 2020-04-15 Data replacement method based on hybrid storage, related method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010294232.8A CN113297106A (en) 2020-04-15 2020-04-15 Data replacement method based on hybrid storage, related method, device and system

Publications (1)

Publication Number Publication Date
CN113297106A true CN113297106A (en) 2021-08-24

Family

ID=77318564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010294232.8A Pending CN113297106A (en) 2020-04-15 2020-04-15 Data replacement method based on hybrid storage, related method, device and system

Country Status (1)

Country Link
CN (1) CN113297106A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023273659A1 (en) * 2021-06-30 2023-01-05 华为技术有限公司 Cache file management method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40058639; Country of ref document: HK)