CN115878677B - Data processing method and device for distributed multi-level cache - Google Patents

Info

Publication number
CN115878677B
Authority
CN
China
Prior art keywords
data object
data objects
local cache
priority list
cache
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310086503.4A
Other languages
Chinese (zh)
Other versions
CN115878677A (en)
Inventor
陈大伟
朱路明
张立群
徐莉萍
张庆丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XCMG Hanyun Technologies Co Ltd
Original Assignee
XCMG Hanyun Technologies Co Ltd
Application filed by XCMG Hanyun Technologies Co Ltd
Priority to CN202310086503.4A
Publication of CN115878677A
Application granted
Publication of CN115878677B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a data processing method and device for a distributed multi-level cache. A corresponding local cache is constructed for each of a plurality of application instances of a target application, and an associated pre-priority list is constructed for each local cache. From the unique identifier set of a first number of data objects in the pre-priority list associated with each local cache, the unique identifier set of a second number of data objects with the highest heat metric values is obtained, and the second number of data objects and their heat metric values are written into each local cache. A third number of data objects, identified by the unique identifiers in the first set other than those of the second number of data objects, are written in a distributed manner into the cache nodes of the second-level distributed cache. The method and device can improve the hit rate of data objects in the distributed multi-level cache and improve the response performance of the target application to client data access requests under high concurrency.

Description

Data processing method and device for distributed multi-level cache
Technical Field
The application relates to the technical field of computer software, in particular to a data processing method and device for distributed multi-level caching.
Background
In a Java-based distributed application system, one feasible way to meet the access performance requirements of a high-concurrency system is to build a distributed multi-level cache architecture from the local cache of the JVM (Java virtual machine) and a distributed cache such as Redis or Memcached. In this architecture, the local cache of the JVM serves as the first-level cache, and an external distributed cache such as Redis or Memcached serves as the second-level cache. To improve the data access performance of the system, hot data objects need to be cached in the capacity-limited local cache so as to reduce the frequency of access to the second-level cache and to the back-end database.
In such a distributed multi-level cache architecture, data objects in the local cache and the second-level cache are conventionally managed with common cache replacement (eviction) policies such as LRU (least recently used) and LFU (least frequently used). The LRU policy caches data objects according to how recently they were accessed, and the LFU policy caches data objects according to how frequently they were accessed. However, both policies may allow a data object that is in fact cold to replace a hot data object in the cache, which reduces the hit rate of data objects in the distributed multi-level cache and thereby degrades the access performance of the target application under high concurrency.
Disclosure of Invention
In view of this, the present application proposes a data processing method and apparatus for a distributed multi-level cache, so as to prevent data objects that are in fact cold from replacing hot data objects in the distributed multi-level cache, improve the hit rate of data objects in the distributed multi-level cache, and improve the access performance of a target application under high concurrency.
In a first aspect of the present application, a data processing method for a distributed multi-level cache is provided, including:
respectively constructing a corresponding local cache for each application instance in a plurality of application instances of a target application, and obtaining a plurality of local caches respectively corresponding to the plurality of application instances;
respectively constructing a pre-priority list for each local cache in the plurality of local caches, and obtaining a plurality of pre-priority lists respectively associated with the plurality of local caches, wherein the pre-priority list is used for recording unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by a client in each application instance in real time and sequencing the data objects according to the heat metric values of the data objects;
obtaining a unique identifier set of a second number of data objects with highest heat metric values from the unique identifier sets of the first number of data objects in the pre-priority list associated with each local cache, obtaining the second number of data objects according to the unique identifier sets of the second number of data objects, writing the second number of data objects and the heat metric values thereof into each local cache, obtaining a third number of data objects according to the unique identifier sets of the third number of data objects except the unique identifier sets of the second number of data objects in the unique identifier sets of the first number of data objects, and writing the third number of data objects into each cache node in the second-level distributed cache in a distributed manner.
In some embodiments, the heat metric value includes a difference between a read count and an update count for each data object monitored in real-time by each application instance.
In some embodiments, the method further comprises:
each application instance responds to an access request of a client to a target data object, calculates a heat metric value of the target data object in real time, and updates the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object;
judging whether the target data object hits in the local cache corresponding to each application instance, if not, searching from the secondary distributed cache or a database node at the back end to obtain the target data object;
judging whether the heat metric value of the target data object is larger than the minimum heat metric value of the data object in the local cache corresponding to each application instance; if so, replacing the data object with the minimum heat metric value and the heat metric value thereof in the local cache corresponding to each application instance with the target data object and the heat metric value thereof.
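The access-handling embodiment above can be illustrated with a minimal Python sketch. It assumes the local cache is already populated and is modelled as a dictionary mapping each unique identifier to a (value, heat metric value) pair; the function name `on_access` and the `fetch` callback (standing in for the lookup in the second-level distributed cache or the back-end database) are hypothetical names introduced only for illustration.

```python
def on_access(key, heat, local_cache, fetch):
    """Handle one client access.

    local_cache maps key -> (value, heat) and is assumed to be non-empty.
    fetch(key) is a hypothetical callback that loads the value from the
    second-level distributed cache or the back-end database.
    """
    if key in local_cache:
        value, _ = local_cache[key]
        local_cache[key] = (value, heat)  # refresh the stored heat on a hit
        return value
    value = fetch(key)  # local miss: go to the next level
    coldest = min(local_cache, key=lambda k: local_cache[k][1])
    if heat > local_cache[coldest][1]:
        # Replace the entry with the minimum heat metric value.
        del local_cache[coldest]
        local_cache[key] = (value, heat)
    return value


cache = {"a": ("A", 5), "b": ("B", 2)}
backing = {"c": "C", "d": "D"}
v1 = on_access("c", 3, cache, backing.get)  # hotter than "b": replaces it
v2 = on_access("d", 1, cache, backing.get)  # colder than every entry: not cached
v3 = on_access("a", 6, cache, backing.get)  # hit: heat is refreshed
```

Unlike LRU, a single access to a low-heat object such as "d" does not displace anything, because admission depends on the heat metric value rather than on recency.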
In some embodiments, the updating the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object includes:
determining the position of the unique identifier of the target data object in the pre-priority list according to the comparison of the heat metric value of the target data object and the heat metric value of the data object in the pre-priority list associated with the local cache corresponding to each application instance;
updating the pre-priority list according to the position of the unique identifier of the target data object in the pre-priority list, so that the pre-priority list keeps the unique identifiers of the first number of data objects with the highest recorded heat metric values and the heat metric values.
In some embodiments, the method further comprises:
and if the target data object hits in the local cache corresponding to each application instance, updating the position and the heat metric value of the target data object in the local cache corresponding to each application instance according to the heat metric value of the target data object.
In some embodiments, the method further comprises:
Combining and sequencing the plurality of pre-priority lists to obtain a global pre-priority list, wherein the global pre-priority list records unique identifiers and global heat metric values of a first number of data objects with the highest global heat metric values accessed by clients in the plurality of application instances;
comparing the global pre-priority list with the pre-priority list associated with each local cache, and obtaining a difference set between a unique identifier set of a second number of data objects with highest global heat metric values in the global pre-priority list and a unique identifier set of a second number of data objects with highest heat metric values in the pre-priority list associated with each local cache;
updating the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by the unique identifier in the difference set;
judging whether the global heat metric value of any data object is larger than the minimum heat metric value of the data object in the local cache; if so, searching any data object from the second-level distributed cache or the database node at the back end, and replacing the data object with the minimum heat metric value and the heat metric value thereof in each local cache with any data object and the global heat metric value thereof.
In some embodiments, the updating the pre-priority list associated with each local cache according to the global heat metric value for any data object identified by the unique identifier in the difference set comprises:
determining the position of the unique identifier of any one data object in the pre-priority list associated with each local cache according to the comparison of the global heat metric value of any data object and the heat metric values of the data objects in the pre-priority list associated with each local cache;
updating the pre-priority list associated with each local cache according to the position of the unique identifier of any data object in the pre-priority list associated with each local cache, so that the pre-priority list associated with each local cache keeps the unique identifier and the heat metric value of the first number of data objects with the highest recorded heat metric value.
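A rough Python sketch of the merge-and-compare embodiment follows. The text above does not state how a global heat metric value is derived from the per-instance values, so the sketch assumes, purely for illustration, that it is the sum of the per-instance heat metric values. The function names `merge_global` and `missing_hot_keys` are likewise hypothetical.

```python
from collections import defaultdict


def merge_global(local_lists, p):
    """Merge per-instance lists into a global list of at most p entries.

    Assumption: global heat of a key = sum of its per-instance heats.
    """
    total = defaultdict(int)
    for heat_by_key in local_lists:
        for key, heat in heat_by_key.items():
            total[key] += heat
    # Hottest first; ties broken by key so the order is deterministic.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:p]


def missing_hot_keys(global_ranked, local_heat_by_key, n):
    """Keys in the global top n that are absent from the local top n."""
    global_top = {k for k, _ in global_ranked[:n]}
    local_ranked = sorted(local_heat_by_key.items(), key=lambda kv: (-kv[1], kv[0]))
    local_top = {k for k, _ in local_ranked[:n]}
    return global_top - local_top


inst1 = {"a": 5, "b": 3, "c": 1}
inst2 = {"c": 6, "d": 4, "a": 1}
global_ranked = merge_global([inst1, inst2], p=3)
diff = missing_hot_keys(global_ranked, inst1, n=2)
```

Here "c" is only lukewarm in the first instance but hot across the whole application, so it lands in the difference set and becomes a candidate for that instance's local cache.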
In some embodiments, the method further comprises:
in each metric cycle, calculating, for the pre-priority list associated with each local cache, a first average hit rate in the local cache for the unique identifier set of the second number of data objects and a second average hit rate in the second-level distributed cache for the unique identifier set of the third number of data objects;
judging whether the first average hit rate is smaller than a preset threshold value and the second average hit rate is larger than the preset threshold value; if so, attenuating the heat metric values of all the data objects in the pre-priority list and the local cache proportionally until the first average hit rate is greater than or equal to the preset threshold value in a subsequent metric cycle.
In some embodiments, the first average hit rate is a ratio of a total number of hits in the local cache for the set of unique identifiers of the second number of data objects to the second number, and the second average hit rate is a ratio of a total number of hits in the second level distributed cache for the set of unique identifiers of the third number of data objects to the third number.
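The hit-rate check and proportional attenuation can be sketched in Python as follows. The hit counters, the threshold value, and the decay ratio `factor` are illustrative assumptions; the text above specifies only that the heat metric values are attenuated proportionally.

```python
def average_hit_rate(hits_by_key, keys):
    """Total hits for the given identifiers divided by their count."""
    return sum(hits_by_key.get(k, 0) for k in keys) / len(keys)


def maybe_decay(heat_by_key, first_rate, second_rate, threshold, factor=0.5):
    """Scale every heat value by `factor` when the local cache underperforms.

    `factor` is a hypothetical decay ratio chosen for this sketch.
    """
    if first_rate < threshold and second_rate > threshold:
        return {k: h * factor for k, h in heat_by_key.items()}
    return dict(heat_by_key)


local_keys, second_keys = ["a", "b"], ["c", "d"]
first = average_hit_rate({"a": 2, "b": 0}, local_keys)    # hits in the local cache
second = average_hit_rate({"c": 6, "d": 4}, second_keys)  # hits in the second-level cache
decayed = maybe_decay({"a": 10, "b": 4}, first, second, threshold=3.0)
```

Attenuating stale heat metric values lets data objects that have recently become hot overtake entries whose high values were accumulated in earlier metric cycles.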
In a second aspect of the present application, a data processing apparatus for distributed multi-level caching is further provided, including:
the cache construction unit is used for respectively constructing corresponding local caches for each application instance in a plurality of application instances of the target application to obtain a plurality of local caches respectively corresponding to the plurality of application instances;
The data monitoring unit is used for respectively constructing a pre-priority list for each local cache in the plurality of local caches, obtaining a plurality of pre-priority lists respectively associated with the plurality of local caches, and recording unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by a client in each application instance in real time and sequencing the data objects according to the heat metric values of the data objects;
and the cache processing unit is used for acquiring the unique identifier set of the second number of data objects with highest heat metric value from the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache, acquiring the second number of data objects according to the unique identifier set of the second number of data objects, writing the second number of data objects and the heat metric value thereof into each local cache, acquiring the third number of data objects according to the unique identifier set of the third number of data objects except the unique identifier set of the second number of data objects in the unique identifier set of the first number of data objects, and writing the third number of data objects into each cache node in the second-level distributed cache in a distributed manner.
The data processing method and device for a distributed multi-level cache construct an associated pre-priority list for the local cache corresponding to each application instance of the target application, and record in real time the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values accessed by the client in each application instance. The unique identifier set of the second number of data objects with the highest heat metric values is obtained from the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache; the second number of data objects are obtained according to that set, and the second number of data objects and their heat metric values are written into each local cache. The third number of data objects are obtained according to the unique identifier set of the third number of data objects, that is, the identifiers in the unique identifier set of the first number of data objects other than those of the second number of data objects, and are written in a distributed manner into each cache node in the second-level distributed cache. This effectively prevents hot data objects in the distributed multi-level cache from being replaced by data objects that are accessed by the client in an application instance but are in fact cold, significantly improves the hit rate of data objects in the distributed multi-level cache, and improves the response performance of the target application to client data access requests under high concurrency.
Drawings
FIG. 1 is a schematic architecture diagram of an application scenario to which a data processing method of a distributed multi-level cache of the present application is applicable;
FIG. 2 is a flow diagram of a method of data processing for distributed multi-level caching according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating the working principle of a data processing method of a distributed multi-level cache according to an embodiment of the present application;
FIG. 4 is a partial flow diagram of a data processing method of a distributed multi-level cache according to another embodiment of the present application;
FIG. 5 is a partial flow diagram of a data processing method of a distributed multi-level cache according to another embodiment of the present application;
FIG. 6 is a partial flow diagram of a data processing method of a distributed multi-level cache according to another embodiment of the present application;
FIG. 7 is a partial flow diagram of a data processing method of a distributed multi-level cache according to another embodiment of the present application;
FIG. 8 is a partial flow diagram of a data processing method of a distributed multi-level cache according to another embodiment of the present application;
FIG. 9 is a schematic diagram of a data processing apparatus of a distributed multi-level cache according to an embodiment of the present application;
FIG. 10 is a partial schematic diagram of a data processing apparatus of a distributed multi-level cache according to another embodiment of the present application;
FIG. 11 is a partial schematic diagram of a data processing apparatus of a distributed multi-level cache according to another embodiment of the present application;
FIG. 12 is a partial schematic diagram of a data processing apparatus of a distributed multi-level cache according to another embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present application. However, it should be understood that the described embodiments are only some, but not all, of the exemplary embodiments of the present application and, therefore, the following detailed description of the embodiments of the present application is not intended to limit the scope of the claims of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
It should be noted that the terms "first," "second," and the like in the description and in the claims of this application are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic architecture diagram of an application scenario to which the data processing method of a distributed multi-level cache of the present application is applicable. As shown in Fig. 1, in the system architecture, the local caches corresponding to a plurality of application instances of a target application, together with a plurality of distributed cache nodes, form a distributed multi-level cache: the local caches are the first-level cache, and the distributed cache nodes form the second-level cache. An application instance of a target application, also called a service instance, is an application process through which the target application provides services externally. Each application instance has a corresponding local cache, which is a memory space of a certain capacity within the application process of that application instance.
The local cache corresponding to each application instance caches data objects as key-value pairs {Key: Value}, where Key is the unique identifier of the data object and Value is the value of the data object. To ensure data access response performance under highly concurrent user access requests, the local cache corresponding to each application instance should cache hot data objects as far as possible. When an application instance receives a data access request from a client, it first looks up the data object in its corresponding local cache; on a miss, it then looks up the data object in the distributed cache nodes of the second-level cache. In one embodiment, the second-level cache may be implemented using a Redis in-memory database, and each distributed cache node may be a node in a Redis in-memory database cluster. Redis is an open-source, in-memory, distributed key-value {Key: Value} storage database. Data objects whose heat is one level below that of the data objects in the local cache may be preloaded from the back-end database node into the distributed cache nodes of the second-level cache. Data objects are distributed to the distributed cache nodes according to a hash operation on the Key; when the local cache misses, the application instance looks up the data object in the corresponding distributed cache node according to the hash operation on the Key of the data object. If the second-level cache also misses, the data object is read from the back-end database node, loaded into the distributed cache node, and returned to the client.
As described above, in the application scenario of such a distributed multi-level cache, managing data objects in the local cache and the second-level cache with conventional eviction policies such as LRU (least recently used) and LFU (least frequently used) may allow data objects that are in fact cold to replace hot data objects in the cache, which reduces the hit rate of data objects in the distributed multi-level cache and thus degrades the access performance of the target application under high concurrency. In the present application, hot data objects are data objects with a very high read frequency but a low modification frequency, and cold data objects are data objects with a very low access frequency or a very high modification frequency.
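The lookup path described above (local cache, then the distributed cache node selected by hashing the Key, then the back-end database) can be sketched in Python as below. Plain dictionaries stand in for the local cache, the cache nodes, and the database, and CRC32 is used as the hash only for illustration; the scenario does not prescribe a particular hash function, and the function name `lookup` is hypothetical.

```python
import zlib


def lookup(key, local_cache, nodes, database):
    """Return (value, source), trying local cache, second-level cache, database."""
    if key in local_cache:
        return local_cache[key], "local"
    # The hash of the Key selects the distributed cache node (CRC32 is an assumption).
    node = nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]
    if key in node:
        return node[key], "second-level"
    value = database[key]  # assumes the key exists in the back-end database
    node[key] = value      # load into the distributed cache node on a miss
    return value, "database"


local = {"a": 1}
nodes = [{}, {}]
db = {"a": 1, "b": 2}
r1 = lookup("a", local, nodes, db)
r2 = lookup("b", local, nodes, db)
r3 = lookup("b", local, nodes, db)
```

The first request for "b" falls through to the database and populates a cache node, so the second request for the same key is served from the second-level cache.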
Therefore, the application provides a data processing method and device for distributed multi-level caching, so as to solve the technical problems.
FIG. 2 is a flow chart of a method of data processing for distributed multi-level caching according to an embodiment of the present application. As shown in fig. 2, the data processing method of the distributed multi-level cache of the present application includes the following steps:
Step S201, respectively constructing a corresponding local cache for each application instance in a plurality of application instances of a target application, and obtaining a plurality of local caches respectively corresponding to the plurality of application instances;
Step S202, respectively constructing a pre-priority list for each local cache in the plurality of local caches, and obtaining a plurality of pre-priority lists respectively associated with the plurality of local caches, wherein the pre-priority list is used for recording unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by a client in each application instance in real time, and sorting according to the heat metric values of the data objects;
Step S203, obtaining a unique identifier set of a second number of data objects with highest heat metric values from the unique identifier sets of the first number of data objects in the pre-priority list associated with each local cache, obtaining the second number of data objects according to the unique identifier sets of the second number of data objects, writing the second number of data objects and the heat metric values thereof into each local cache, obtaining a third number of data objects according to the unique identifier sets of the third number of data objects in the unique identifier sets of the first number of data objects except the unique identifier sets of the second number of data objects, and writing the third number of data objects into each cache node in the second-level distributed cache in a distributed manner.
In this embodiment, multiple application instances of the target application may run on one or more server nodes in the Java distributed application system. Each application instance of the target application is respectively allocated with a corresponding local cache, wherein the local cache is a memory space with a predetermined capacity in an application process of each application instance. Each application instance of the target application independently constructs a pre-priority list associated with the local cache, where the pre-priority list is used to record, in real time, unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by the client in each application instance.
In this application, the heat metric value is a quantitative measure of the heat of each data object accessed by the client in each application instance. In one embodiment, the heat metric value may include the difference between the read count and the update count of each data object, as monitored in real time by each application instance. For each data object, a larger number of reads reflects a higher heat, and a larger number of updates reflects a lower heat. Thus, to increase the hit rate of data objects in the local cache, the local cache corresponding to each application instance should store, as far as possible, the data objects with the highest heat, that is, those read most often and updated least often. The difference between the number of reads and the number of updates of each data object can effectively represent the actual heat of each data object accessed by the client in each application instance.
Assume that the pre-priority list associated with the local cache of each application instance is expressed as $L$, and that the unique identifier of any one data object in the pre-priority list is expressed as $k_i$, $1 \le i \le P$, where $P$ represents the number of unique identifiers of data objects in the pre-priority list $L$ corresponding to the local cache. The heat metric value $H(k_i)$ of the data object $k_i$ can then be calculated by the following formula:
$$H(k_i) = R(k_i) - U(k_i)$$
where $R(k_i)$ represents the number of reads of the data object identified by the unique identifier $k_i$, and $U(k_i)$ represents the number of updates of the data object identified by the unique identifier $k_i$.
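As a minimal illustration of this heat metric, the Python sketch below keeps per-identifier read and update counters and returns their difference. The class name `HeatTracker` and its method names are hypothetical, chosen only for this example.

```python
from collections import defaultdict


class HeatTracker:
    """Per-instance read/update counters (hypothetical helper for illustration)."""

    def __init__(self):
        self.reads = defaultdict(int)
        self.updates = defaultdict(int)

    def record_read(self, key):
        self.reads[key] += 1

    def record_update(self, key):
        self.updates[key] += 1

    def heat(self, key):
        # Heat metric value = read count - update count.
        return self.reads[key] - self.updates[key]


tracker = HeatTracker()
for _ in range(5):
    tracker.record_read("user:42")
tracker.record_update("user:42")
tracker.record_update("order:7")
hot = tracker.heat("user:42")   # 5 reads - 1 update
cold = tracker.heat("order:7")  # 0 reads - 1 update
```

A frequently modified object can end up with a negative value, which matches the description of frequently updated data objects as cold.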
In order to better record in real time the hot data objects accessed by the client in each application instance and to allocate them among the cache levels, the number P of unique identifiers of data objects in the pre-priority list associated with the local cache corresponding to each application instance is greater than the number N of data objects in the local cache. Assume that the set of data objects in the local cache corresponding to each application instance is represented as $C$. The number $N$ of data objects in the data object set $C$ in the local cache is smaller than the number $P$ of unique identifiers of data objects in the pre-priority list $L$ associated with the local cache, i.e., $N < P$.
In this embodiment, the pre-priority list associated with each local cache records in advance the unique identifiers (Keys) and heat metric values of the top P data objects with the highest heat metric values among all data objects accessed by the client in each application instance. Like the local cache, the pre-priority list associated with each local cache may occupy a predetermined amount of memory space in the application process of each application instance.
In one embodiment, the unique identifiers of the P data objects in the pre-priority list are arranged in descending (high-to-low) or ascending (low-to-high) order of the heat metric value of each data object.
Each application instance acquires the unique identifiers of the first N data objects with the largest heat metric values from the pre-priority list, reads the values corresponding to the N data objects from the second-level distributed cache or the back-end database node according to those unique identifiers, and writes the key-value pairs of the N data objects and their heat metric values into the local cache corresponding to each application instance. Likewise, the data object set in the local cache may be ordered by heat metric value from high to low or from low to high.
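One possible way to maintain such a bounded, ordered list is sketched below in Python. The class name `PrePriorityList`, the dictionary-based storage, and the tie-breaking by key are illustrative assumptions; the embodiment only requires that the list retain the P identifiers with the highest heat metric values in a predetermined order.

```python
class PrePriorityList:
    """Keeps the identifiers of the P hottest data objects (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity  # P
        self.heat_by_key = {}     # unique identifier -> heat metric value

    def update(self, key, heat):
        self.heat_by_key[key] = heat
        if len(self.heat_by_key) > self.capacity:
            # Drop the coldest entry so that only the top P remain.
            coldest = min(self.heat_by_key, key=self.heat_by_key.get)
            del self.heat_by_key[coldest]

    def ordered(self):
        # Descending heat; ties broken by key for a deterministic order.
        return sorted(self.heat_by_key.items(), key=lambda kv: (-kv[1], kv[0]))


plist = PrePriorityList(capacity=3)
for key, heat in [("a", 5), ("b", 2), ("c", 9), ("d", 4)]:
    plist.update(key, heat)
ranking = plist.ordered()
```

With P = 3, recording "d" pushes out "b", the coldest identifier, and `ordered()` returns the survivors from hottest to coldest. A heap or sorted container would avoid the linear scan for large P; the dictionary is used here only to keep the sketch short.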
In this embodiment, the pre-priority list associated with each local cache may be divided into two sub-lists, a first pre-priority list and a second pre-priority list, according to the minimum heat metric value of the data objects written into the local cache. The first pre-priority list consists of the unique identifiers of the N data objects written into the local cache, and the second pre-priority list consists of the unique identifiers of the P-N data objects whose heat metric values are smaller than that minimum heat metric value. After the data objects identified by the N unique identifiers in the first pre-priority list have been written into the associated local cache, the data objects identified by the P-N unique identifiers in the second pre-priority list may be acquired from the back-end database node and written in a distributed manner into the distributed cache nodes of the second-level distributed cache. For each application instance, the local cache serving as the first-level cache thus stores the hottest data object set in the pre-priority list, and the distributed cache nodes serving as the second-level cache store the next-hottest data object set in the pre-priority list.
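The split into the first N and the remaining P-N identifiers, and the resulting placement of data objects, can be sketched in Python as follows. Dictionaries stand in for the local cache, the cache nodes, and the back-end database; the CRC32-based node selection and the function names `node_for` and `populate` are assumptions made for this illustration.

```python
import zlib


def node_for(key, node_count):
    # A stable hash of the key picks the distributed cache node (CRC32 assumed).
    return zlib.crc32(key.encode("utf-8")) % node_count


def populate(ordered, n, backend, node_count):
    """ordered: list of (key, heat) pairs, hottest first, of length P."""
    first, second = ordered[:n], ordered[n:]  # top N and remaining P - N
    # The local cache keeps each value together with its heat metric value.
    local_cache = {k: (backend[k], h) for k, h in first}
    nodes = [dict() for _ in range(node_count)]
    for k, _ in second:
        nodes[node_for(k, node_count)][k] = backend[k]
    return local_cache, nodes


ordered = [("c", 9), ("a", 5), ("d", 4), ("b", 2)]
backend = {"a": "A", "b": "B", "c": "C", "d": "D"}
local_cache, nodes = populate(ordered, n=2, backend=backend, node_count=3)
```

With P = 4 and N = 2, the two hottest objects go to the local cache with their heat metric values, and the other two are spread across the three cache nodes by key hash.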
To better explain the technical solution of this embodiment, an example is given below with reference to the schematic diagram of the working principle of the pre-priority list shown in fig. 3. As shown in fig. 3, each application instance of the target application (application instance 1, application instance 2, ..., application instance X) constructs a pre-priority list 302 for its local cache 303. Taking application instance 1 as an example, when application instance 1 receives an access request 301 from a client for a target data object, it calculates the heat metric value H of the target data object in real time and records the unique identifier (Key) of the target data object and its heat metric value H in the pre-priority list 302, at a position determined by the magnitude of H and in the predetermined order. The pre-priority list 302 thus records the unique identifiers (Keys) of the top P data objects with the highest heat metric values among all data objects accessed by clients in application instance 1. Application instance 1 obtains the unique identifiers of the N data objects with the largest heat metric values from the unique identifier set of the P data objects in the pre-priority list 302, reads the values (Value) corresponding to these N data objects from the second-level distributed cache or the database node at the back end according to their unique identifiers, and writes the key-value pairs {Key: Value} and the heat metric values H of the N data objects into the local cache 303 corresponding to application instance 1.
Meanwhile, application instance 1 obtains the unique identifiers of the remaining P-N data objects, which have lower heat metric values, from the unique identifier set of the P data objects in the pre-priority list 302, acquires the identified data objects from the database node at the back end, and writes them in a distributed manner into the distributed cache nodes 304 of the second-level distributed cache. This data processing mode can therefore effectively prevent hot data objects in the distributed multi-level cache from being replaced by data objects that are in fact cold within an application instance, significantly improve the hit rate of data objects in the distributed multi-level cache, and improve the response performance of the target application to client data access requests under high concurrency.
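The split of the pre-priority list between the two cache levels can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `warm_caches`, the `load_value` callback (standing in for a read from the second-level distributed cache or the back-end database) and the use of plain dictionaries as caches are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple


def warm_caches(
    pre_priority: List[Tuple[str, float]],  # (Key, heat metric value H) pairs, any order
    n: int,                                 # N: capacity of the local cache
    load_value: Callable[[str], object],    # hypothetical loader (L2 cache or database)
) -> Tuple[Dict[str, Tuple[object, float]], Dict[str, object]]:
    """Send the N hottest keys to the local cache and the other P-N keys to the L2 cache."""
    ranked = sorted(pre_priority, key=lambda kv: kv[1], reverse=True)
    first, second = ranked[:n], ranked[n:]  # first / second pre-priority list
    local_cache = {key: (load_value(key), heat) for key, heat in first}
    l2_cache = {key: load_value(key) for key, _ in second}
    return local_cache, l2_cache


# Toy data: P = 5 recorded keys, N = 2 local slots.
backing_store = {"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}
pre_priority_list = [("a", 9.0), ("b", 2.5), ("c", 7.0), ("d", 1.0), ("e", 5.5)]
local_cache, l2_cache = warm_caches(pre_priority_list, 2, backing_store.__getitem__)
```

In this toy run the local cache receives keys `a` and `c`, stored with their heat metric values, and the remaining three keys go to the second-level cache.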
In summary, the embodiment of the application constructs an associated pre-priority list for the local cache corresponding to each application instance of the target application, and records in real time the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values accessed by clients in each application instance. From the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache, it acquires the unique identifier set of the second number of data objects with the highest heat metric values, obtains the second number of data objects accordingly, and writes them and their heat metric values into each local cache. It further obtains the third number of data objects according to the unique identifiers that remain in the unique identifier set of the first number of data objects once the unique identifier set of the second number of data objects is excluded, and writes the third number of data objects in a distributed manner into the cache nodes of the second-level distributed cache. This effectively prevents hot data objects in the distributed multi-level cache from being replaced by data objects accessed by clients in an application instance that are in fact cold, significantly improves the hit rate of data objects in the distributed multi-level cache, and improves the response performance of the high-concurrency target application to client data access requests.
In some embodiments, based on the foregoing examples, as shown in fig. 4, the method may further include the steps of:
step S401, each application instance responds to an access request of a client to a target data object, calculates a heat metric value of the target data object in real time, and updates the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object;
step S402, determining whether the target data object hits in the local cache corresponding to each application instance, and if not, searching from the second-level distributed cache or the database node at the back end to obtain the target data object;
step S403, determining whether the heat metric value of the target data object is greater than the minimum heat metric value of the data object in the local cache corresponding to each application instance; if so, replacing the data object with the minimum heat metric value and the heat metric value thereof in the local cache corresponding to each application instance with the target data object and the heat metric value thereof.
In this embodiment, for an access request from a client for a target data object, each application instance of the target application calculates the latest heat metric value of the target data object in real time and updates the pre-priority list associated with its corresponding local cache according to that heat metric value.
The embodiment then determines whether the target data object hits in the local cache. If not, the target data object is looked up in the distributed cache nodes of the second-level distributed cache according to its unique identifier; if it is not found there either, it is retrieved from the database node at the back end and written into a distributed cache node of the second-level distributed cache.
After the target data object has been obtained, it is judged whether its heat metric value is larger than the minimum heat metric value of the data objects in the local cache. If so, the position of the target data object in the local cache is determined according to its heat metric value, and the target data object and its heat metric value are written into the local cache. When the local cache already holds its maximum number N of data objects, writing the target data object and its heat metric value into the local cache means evicting the data object with the minimum heat metric value, together with that heat metric value, from the set of data objects in the local cache and putting the target data object in its place.
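The lookup and admission logic of steps S401 to S403 can be sketched as follows. This is a simplified, single-threaded illustration only: the class `TieredCache`, its dictionary-backed tiers, and the choice to admit objects freely while the local cache is below capacity are assumptions, and the heat metric value is passed in by the caller rather than computed.

```python
from typing import Dict, Tuple


class TieredCache:
    """Toy model of one application instance: bounded local cache, L2 cache, database."""

    def __init__(self, capacity: int, l2: Dict[str, str], db: Dict[str, str]) -> None:
        self.capacity = capacity                        # N, maximum number of local entries
        self.local: Dict[str, Tuple[str, float]] = {}   # Key -> (Value, heat metric value)
        self.l2 = l2                                    # stands in for the second-level cache
        self.db = db                                    # stands in for the back-end database

    def get(self, key: str, heat: float) -> str:
        """Serve one request; `heat` is the freshly computed heat metric value of `key`."""
        if key in self.local:                           # local hit: refresh the stored heat
            value, _ = self.local[key]
            self.local[key] = (value, heat)
            return value
        if key in self.l2:                              # S402: local miss, try the L2 cache
            value = self.l2[key]
        else:                                           # L2 miss: read the database, fill L2
            value = self.db[key]
            self.l2[key] = value
        self._admit(key, value, heat)                   # S403
        return value

    def _admit(self, key: str, value: str, heat: float) -> None:
        if len(self.local) < self.capacity:             # assumption: admit freely until full
            self.local[key] = (value, heat)
            return
        coldest = min(self.local, key=lambda k: self.local[k][1])
        if heat > self.local[coldest][1]:               # replace only if hotter than the minimum
            del self.local[coldest]
            self.local[key] = (value, heat)


db = {"k1": "v1", "k2": "v2", "k3": "v3"}
cache = TieredCache(capacity=2, l2={}, db=db)
cache.get("k1", heat=5.0)
cache.get("k2", heat=3.0)
cache.get("k3", heat=1.0)  # colder than every local entry: stays in L2 only
cache.get("k3", heat=4.0)  # now hotter than k2: k2 is evicted from the local cache
```

The third request shows the point of the admission test: a single access to a cold object does not displace anything from the local cache, unlike a plain least-recently-used policy.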
In one embodiment, the method may further comprise:
if the target data object hits in the local cache corresponding to each application instance, updating the position and the heat metric value of the target data object in that local cache according to the heat metric value of the target data object.
In this embodiment, since the set of data objects in the local cache may also be ordered by heat metric value from high to low or from low to high, the position of the target data object in the local cache may be updated at the same time as its heat metric value is updated.
Further, for a data object replaced by the target data object in the local cache, it can additionally be judged whether the replaced data object is already cached in the distributed cache nodes of the second-level distributed cache; if not, the replaced data object can be written in a distributed manner into the distributed cache nodes of the second-level distributed cache.
In summary, in this embodiment, each application instance responds to an access request of a client to a target data object, calculates the heat metric value of the target data object in real time, and updates the pre-priority list associated with the local cache according to that heat metric value. It determines whether the target data object hits in the local cache and, if not, looks up the target data object in the second-level distributed cache or the database node at the back end. It then determines whether the heat metric value of the target data object is greater than the minimum heat metric value of the data objects in the local cache; if so, the data object with the minimum heat metric value in the local cache corresponding to each application instance, together with its heat metric value, is replaced with the target data object and its heat metric value. Each application instance thus records the latest heat metric value of the accessed target data object in real time after receiving a new data access request. On a local cache miss, the target data object is written into the local cache only when its heat metric value is larger than the minimum heat metric value in the local cache; otherwise it is kept in the second-level distributed cache. This effectively prevents hot data objects in the distributed multi-level cache from being replaced by data objects accessed by clients in an application instance that are in fact cold, improves the hit rate of data objects in the distributed multi-level cache, and improves the response performance of the high-concurrency target application to client data access requests.
In one embodiment, based on any of the foregoing embodiments, as shown in fig. 5, the updating, in step S401, of the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object may include the following steps:
step S501, determining a position of the unique identifier of the target data object in the pre-priority list according to a comparison between the heat metric value of the target data object and the heat metric values of the data objects in the pre-priority list associated with the local cache corresponding to each application instance;
step S502, updating the pre-priority list according to the position of the unique identifier of the target data object in the pre-priority list, so that the pre-priority list continues to record the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values.
This embodiment determines the position of the unique identifier of the target data object in the pre-priority list by comparing the heat metric value of the target data object with the heat metric values of the data objects in the pre-priority list, and updates the pre-priority list based on that position. The updated pre-priority list always records the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values accessed by clients in each application instance.
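Steps S501 and S502 can be sketched in a few lines of Python. The function name `update_pre_priority` and the dictionary representation are assumptions made for illustration, and the list is simply re-sorted on every update for clarity; a production system would more likely maintain an ordered structure incrementally.

```python
from typing import Dict, List, Tuple


def update_pre_priority(
    heats: Dict[str, float], key: str, heat: float, p: int
) -> List[Tuple[str, float]]:
    """Record the latest heat of `key`, keep only the P hottest keys, return them hottest first."""
    heats[key] = heat                                   # insert or refresh the target object
    ranked = sorted(heats.items(), key=lambda kv: kv[1], reverse=True)
    for cold_key, _ in ranked[p:]:                      # drop everything beyond position P
        del heats[cold_key]
    return ranked[:p]


heats = {"a": 9.0, "b": 4.0, "c": 6.0}
top = update_pre_priority(heats, "d", 5.0, p=3)         # "d" enters, "b" falls out
```

Here the new object `d` is placed between `c` and `b` by comparison of heat metric values, and `b` is dropped because the list holds at most P = 3 entries.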
In some embodiments, on the basis of any one of the foregoing embodiments, as shown in fig. 6, the method may further include the steps of:
step S601, merging and sorting the plurality of pre-priority lists to obtain a global pre-priority list, wherein the global pre-priority list records the unique identifiers and global heat metric values of a first number of data objects with the highest global heat metric values accessed by clients in the plurality of application instances;
step S602, comparing the global pre-priority list with the pre-priority list associated with each local cache, to obtain a difference set between a unique identifier set of a second number of data objects with highest global heat metric values in the global pre-priority list and a unique identifier set of a second number of data objects with highest heat metric values in the pre-priority list associated with each local cache;
step S603, updating the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by a unique identifier in the difference set;
step S604, determining whether the global heat metric value of said any data object is greater than the minimum heat metric value of the data objects in each local cache; if so, looking up said any data object in the second-level distributed cache or the database node at the back end, and replacing the data object with the minimum heat metric value and the heat metric value thereof in each local cache with said any data object and the global heat metric value thereof.
In this embodiment, the pre-priority lists associated with the local caches corresponding to the multiple application instances may be aggregated in the application dimension to construct a global pre-priority list for all application instances of the target application. The global pre-priority list is formed by merging and sorting the pre-priority lists associated with the local caches corresponding to the multiple application instances and taking the unique identifiers of the P data objects with the largest global heat metric values. The global heat metric value of a data object is the maximum of its heat metric values across the pre-priority lists associated with the local caches of the application instances. If a data object appears in the pre-priority lists of more than one application instance, the largest heat metric value is retained as its global heat metric value, and its position in the global pre-priority list is determined by this global heat metric value after merging.
This embodiment then compares the global pre-priority list with the pre-priority list associated with each local cache. Specifically, it takes the difference set of the unique identifier set of the N data objects with the highest global heat metric values in the global pre-priority list relative to the unique identifier set of the N data objects with the highest heat metric values in the pre-priority list associated with each local cache. The pre-priority list associated with each local cache is updated according to the global heat metric value of any data object identified by a unique identifier in the difference set. At the same time, the global heat metric values of the data objects identified by the unique identifiers in the difference set are compared with the minimum heat metric value of the data objects in each local cache. If the global heat metric value of such a data object is larger than the minimum heat metric value of the data objects in a local cache, the position of that data object in the local cache is determined according to its global heat metric value, the data object and its global heat metric value are written into the local cache, and the data object with the minimum heat metric value in the local cache, together with its heat metric value, is replaced.
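The merge of step S601 and the difference set of step S602 can be sketched as below. The helper names `merge_global`, `top_keys` and `missing_hot_keys` are hypothetical, and each pre-priority list is modelled as a dictionary from unique identifier to heat metric value.

```python
from typing import Dict, List, Set


def merge_global(lists: List[Dict[str, float]], p: int) -> Dict[str, float]:
    """Merge per-instance lists, keeping the maximum heat per key and only the P hottest keys."""
    merged: Dict[str, float] = {}
    for heats in lists:
        for key, heat in heats.items():
            merged[key] = max(heat, merged.get(key, heat))
    top = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:p]
    return dict(top)


def top_keys(heats: Dict[str, float], n: int) -> Set[str]:
    """Unique identifiers of the N entries with the highest heat metric values."""
    return set(sorted(heats, key=lambda k: heats[k], reverse=True)[:n])


def missing_hot_keys(
    global_list: Dict[str, float], local_list: Dict[str, float], n: int
) -> Set[str]:
    """Keys in the global top N that are absent from this instance's own top N."""
    return top_keys(global_list, n) - top_keys(local_list, n)


instance_1 = {"a": 9.0, "b": 4.0, "c": 2.0}
instance_2 = {"a": 3.0, "d": 8.0, "e": 1.0}
global_list = merge_global([instance_1, instance_2], p=3)
diff_1 = missing_hot_keys(global_list, instance_1, n=2)
diff_2 = missing_hot_keys(global_list, instance_2, n=2)
```

With these toy lists, `d` is globally hot but unknown to instance 1, so it appears in the difference set for instance 1 and becomes a candidate for that instance's local cache, subject to the comparison with the local minimum heat metric value described in step S604.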
The solution of this embodiment may be implemented in combination with the ZooKeeper distributed application coordination service. ZooKeeper is an open-source distributed application coordination service. Each application instance of the target application may be registered as an ephemeral (temporary) node on ZooKeeper and may periodically send the pre-priority list associated with its local cache to its corresponding ephemeral node. A background management process may obtain the pre-priority lists sent by the application instances by subscribing to each ephemeral node, merge them into the global pre-priority list, obtain the difference set between the global pre-priority list and the pre-priority list associated with each local cache, and notify each application instance accordingly.
Further, for a data object that has been replaced in the local cache, it can additionally be judged whether the replaced data object is already cached in the distributed cache nodes of the second-level distributed cache; if not, the replaced data object can be written in a distributed manner into the distributed cache nodes of the second-level distributed cache.
In summary, this embodiment obtains a global pre-priority list by merging and sorting the plurality of pre-priority lists respectively associated with the plurality of local caches, and compares the global pre-priority list with the pre-priority list associated with each local cache to obtain the difference set between the unique identifier set of the second number of data objects with the highest global heat metric values in the global pre-priority list and the unique identifier set of the second number of data objects with the highest heat metric values in the pre-priority list associated with each local cache. When the global heat metric value of any data object identified by a unique identifier in the difference set is greater than the minimum heat metric value of the data objects in a local cache, the data object with the minimum heat metric value in that local cache, together with its heat metric value, is replaced with said data object and its global heat metric value. In this way, global hot data objects are identified in the application dimension and written into each local cache, which further prevents hot data objects in the distributed multi-level cache from being replaced by data objects that are in fact cold, improves the hit rate of data objects in the distributed multi-level cache, and improves the response performance of the high-concurrency target application to client data access requests.
In some embodiments, on the basis of any of the foregoing embodiments, as shown in fig. 7, the updating, in step S603, of the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by a unique identifier in the difference set may further include the following steps:
step S701, determining a position of the unique identifier of said any data object in the pre-priority list associated with each local cache according to a comparison between the global heat metric value of said any data object and the heat metric values of the data objects in the pre-priority list associated with each local cache;
step S702, updating the pre-priority list associated with each local cache according to the position of the unique identifier of said any data object in the pre-priority list associated with each local cache, so that the pre-priority list associated with each local cache continues to record the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values.
This embodiment compares the global heat metric value of any data object identified by a unique identifier in the difference set with the heat metric values of the data objects in the pre-priority list associated with the local cache, determines the position of the unique identifier of that data object in the pre-priority list associated with the local cache, and updates the pre-priority list based on that position. The updated pre-priority list always records the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values accessed by clients in the application.
In some embodiments, on the basis of any one of the foregoing embodiments, as shown in fig. 8, the method may further include the steps of:
step S801, calculating a first average hit rate of the unique identifier set of the second number of data objects in the pre-priority list associated with each local cache in each metric period and a second average hit rate of the unique identifier set of the third number of data objects in the second level distributed cache;
step S802, determining whether the first average hit rate and the second average hit rate meet the condition that the first average hit rate is smaller than a predetermined threshold and the second average hit rate is larger than the predetermined threshold; if so, attenuating the heat metric values of all the data objects in the pre-priority list and the local cache proportionally until the first average hit rate is larger than or equal to the predetermined threshold in a subsequent metric period.
In this embodiment, the first average hit rate characterizes the average hit rate, in the local cache, of the unique identifier set of the second number of data objects in the pre-priority list associated with each local cache (corresponding to the first pre-priority list in the foregoing embodiment) in each metric period. The second average hit rate characterizes the average hit rate, in the second-level distributed cache, of the unique identifier set of the third number of data objects in the pre-priority list associated with each local cache (corresponding to the second pre-priority list in the foregoing embodiment) in each metric period. In one embodiment, the first average hit rate is the ratio of the total number of hits in each local cache for the unique identifier set of the second number of data objects to the second number, and the second average hit rate is the ratio of the total number of hits in the second-level distributed cache for the unique identifier set of the third number of data objects to the third number.
It is assumed that in one metric period a certain application instance handles C data access requests from clients, where the relation shown in formula image SMS_16 holds. If the total number of hits in the local cache for the unique identifier set of the data objects in the first pre-priority list of the pre-priority list associated with that application instance's local cache is the quantity shown in formula image SMS_17, the first average hit rate (formula image SMS_18) can be calculated by the formula shown in formula image SMS_19. If the total number of hits in the second-level distributed cache for the unique identifier set of the data objects in the second pre-priority list of that pre-priority list is the quantity shown in formula image SMS_20, the second average hit rate (formula image SMS_21) can be calculated by the formula shown in formula image SMS_22.
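Because the formula images are not reproduced in this text, the two ratios defined in words earlier in this embodiment can be restated as follows. The symbols are chosen here purely for illustration and are not the patent's own notation: $H_1$ is the total number of local-cache hits for the N identifiers of the first pre-priority list in one metric period, $H_2$ is the total number of second-level-cache hits for the P-N identifiers of the second pre-priority list, $R_1$ and $R_2$ are the first and second average hit rates, and $T$ is the predetermined threshold.

```latex
\[
  R_1 = \frac{H_1}{N}, \qquad R_2 = \frac{H_2}{P - N}
\]
\[
  \text{attenuation is triggered when } R_1 < T \ \text{and}\ R_2 > T
\]
```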
In each metric period, it is determined whether the first average hit rate and the second average hit rate satisfy the condition that the first average hit rate is less than a predetermined threshold and the second average hit rate is greater than the predetermined threshold. If so, the overall heat trend of the data objects of the second pre-priority list, which were written into the second-level distributed cache, exceeds that of the data objects of the first pre-priority list, which were written into the local cache; that is, the data objects in the second pre-priority list are gradually becoming hot. At this time, the heat metric values of all the data objects in the pre-priority list and the local cache may be attenuated proportionally, so as to narrow the gap between the heat metric values of the data objects in the second pre-priority list and those of the data objects in the first pre-priority list. The data objects in the second pre-priority list can then be written into the local cache sooner in subsequent metric periods, replacing the old, gradually cooling data objects of the first pre-priority list, until the first average hit rate is greater than or equal to the predetermined threshold.
In one embodiment, proportionally attenuating the heat metric values of all data objects in the pre-priority list and the local cache may include halving them, i.e., attenuating the heat metric value of every data object in the pre-priority list and the local cache to 1/2 of its original value.
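A minimal Python sketch of the attenuation check follows. The function name `maybe_decay`, its parameters and the dictionary representation of heat metric values are assumptions for illustration; the default factor of 0.5 corresponds to the halving embodiment.

```python
from typing import Dict


def maybe_decay(
    first_rate: float,
    second_rate: float,
    threshold: float,
    pre_priority: Dict[str, float],   # Key -> heat metric value in the pre-priority list
    local_heat: Dict[str, float],     # Key -> heat metric value stored in the local cache
    factor: float = 0.5,              # 0.5 models the halving embodiment
) -> bool:
    """Scale every heat metric value by `factor` when the trigger condition holds."""
    if first_rate < threshold and second_rate > threshold:
        for heats in (pre_priority, local_heat):
            for key in heats:
                heats[key] *= factor
        return True
    return False


pre_priority = {"a": 8.0, "b": 6.0, "c": 2.0}
local_heat = {"a": 8.0, "b": 6.0}
decayed = maybe_decay(0.4, 1.2, 1.0, pre_priority, local_heat)
not_decayed = maybe_decay(1.5, 1.2, 1.0, pre_priority, local_heat)
```

After the first call the absolute gap between `b` and `c` shrinks from 4.0 to 2.0. If new accesses add to an object's heat metric value (an assumption about how the metric is computed), fewer accesses are then needed for `c` to overtake `b` and enter the local cache.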
This embodiment calculates, for each metric period, the first average hit rate of the unique identifier set of the second number of data objects in the local cache and the second average hit rate of the unique identifier set of the third number of data objects in the second-level distributed cache. When the first average hit rate is smaller than a predetermined threshold and the second average hit rate is larger than the predetermined threshold, the heat metric values of all the data objects in the pre-priority list and the local cache are attenuated proportionally until the first average hit rate in a subsequent metric period is larger than or equal to the predetermined threshold. This further prevents hot data objects in the distributed multi-level cache from being replaced by data objects accessed by clients in each application instance that are in fact cold, improves the hit rate of data objects in the distributed multi-level cache, and improves the response performance of the high-concurrency target application to client data access requests.
FIG. 9 is a schematic diagram of a distributed multi-level cache data processing apparatus according to an embodiment of the present application. As shown in fig. 9, the data processing apparatus of the distributed multi-level cache of the present application includes the following units:
a cache construction unit 901, configured to respectively construct a corresponding local cache for each application instance in a plurality of application instances of a target application, and obtain a plurality of local caches corresponding to the plurality of application instances respectively;
a data monitoring unit 902, configured to construct a pre-priority list for each local cache of the plurality of local caches, obtain a plurality of pre-priority lists associated with the plurality of local caches, and record, in real time, unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by a client in each application instance, and sort the data objects according to the heat metric values of the data objects;
a cache processing unit 903, configured to obtain a unique identifier set of a second number of data objects with highest heat metric values from a unique identifier set of a first number of data objects in the pre-priority list associated with each local cache, obtain the second number of data objects according to the unique identifier set of the second number of data objects, write the second number of data objects and their heat metric values into each local cache, and obtain a third number of data objects according to a unique identifier set of a third number of data objects in the unique identifier set of the first number of data objects, except for the unique identifier set of the second number of data objects, and write the third number of data objects into each cache node in the second-level distributed cache in a distributed manner.
In some embodiments, on the basis of the foregoing examples, as shown in fig. 10, the apparatus may further include the following units:
a first list updating unit 1001, configured to, in response to an access request of a client to a target data object, calculate a heat metric value of the target data object in real time, and update the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object;
a hit result determining unit 1002, configured to determine whether the target data object hits in the local cache corresponding to each application instance, and if not hit, find the target data object from the second-level distributed cache or a database node at the back end;
a first data replacing unit 1003, configured to determine whether a heat metric value of the target data object is greater than a minimum heat metric value of a data object in a local cache corresponding to each application instance; if so, replacing the data object with the minimum heat metric value and the heat metric value thereof in the local cache corresponding to each application instance with the target data object and the heat metric value thereof.
In one embodiment, the apparatus may further include:
a cache updating unit, configured to update the position and the heat metric value of the target data object in the local cache corresponding to each application instance according to the heat metric value of the target data object if the target data object hits in the local cache corresponding to each application instance.
In an embodiment, on the basis of any of the foregoing embodiments, the first list updating unit 1001 may be further configured to:
determining the position of the unique identifier of the target data object in the pre-priority list according to the comparison of the heat metric value of the target data object and the heat metric value of the data object in the pre-priority list associated with the local cache corresponding to each application instance;
updating the pre-priority list according to the position of the unique identifier of the target data object in the pre-priority list, so that the pre-priority list keeps the unique identifiers of the first number of data objects with the highest recorded heat metric values and the heat metric values.
In some embodiments, on the basis of any of the foregoing embodiments, as shown in fig. 11, the apparatus may further include the following units:
a list aggregation unit 1101, configured to merge and sort the plurality of pre-priority lists to obtain a global pre-priority list, where the global pre-priority list records unique identifiers of a first number of data objects with highest global heat metric values accessed by a client in the plurality of application instances and the global heat metric values;
a difference set comparing unit 1102, configured to compare the global pre-priority list with the pre-priority list associated with each local cache, and obtain a difference set between a unique identifier set of a second number of data objects with the highest global heat metric value in the global pre-priority list and a unique identifier set of a second number of data objects with the highest heat metric value in the pre-priority list associated with each local cache;
a second list updating unit 1103, configured to update the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by the unique identifier in the difference set;
a second data replacement unit 1104, configured to determine whether the global heat metric value of the any data object is greater than the minimum heat metric value of the data object in each local cache; if so, searching any data object from the second-level distributed cache or the database node at the back end, and replacing the data object with the minimum heat metric value and the heat metric value thereof in each local cache with any data object and the global heat metric value thereof.
In some embodiments, on the basis of any one of the foregoing embodiments, the second list updating unit 1103 may be further configured to:
determine the position of the unique identifier of said data object in the pre-priority list associated with each local cache by comparing the global heat metric value of said data object with the heat metric values of the data objects in the pre-priority list associated with each local cache;
update the pre-priority list associated with each local cache according to the position of the unique identifier of said data object in that list, so that the pre-priority list associated with each local cache continues to record the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values.
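The position-based update described above amounts to inserting into a list kept sorted by descending heat metric value and truncating it to the first number of entries. A minimal sketch, assuming the list is held as (identifier, heat) pairs and using a hypothetical function name:

```python
import bisect


def upsert_pre_priority(entries, object_id, heat, first_number):
    """Return a new list with (object_id, heat) placed by descending heat.

    `entries` must already be sorted by descending heat. Any previous
    entry for `object_id` is dropped before the new one is inserted, and
    the result keeps at most `first_number` entries.
    """
    kept = [(oid, h) for oid, h in entries if oid != object_id]
    # bisect expects ascending keys, so search on the negated heat values.
    position = bisect.bisect_right([-h for _, h in kept], -heat)
    kept.insert(position, (object_id, heat))
    return kept[:first_number]


current = [("a", 9), ("b", 5), ("c", 1)]
after_insert = upsert_pre_priority(current, "d", 6, first_number=3)
after_promote = upsert_pre_priority(current, "c", 10, first_number=3)
```

An object whose heat metric value is lower than every recorded entry falls off the end when the list is already full, so the list is left unchanged in that case.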
In some embodiments, on the basis of any of the foregoing embodiments, as shown in fig. 12, the apparatus may further include the following units:
a hit calculation unit 1201, configured to calculate, in each metric cycle, a first average hit rate in the local cache of the unique identifier set of the second number of data objects in the pre-priority list associated with each local cache, and a second average hit rate in the second-level distributed cache of the unique identifier set of the third number of data objects;
a heat attenuation unit 1202, configured to determine whether the first average hit rate is smaller than a predetermined threshold while the second average hit rate is greater than the predetermined threshold; and, if so, proportionally attenuate the heat metric values of all data objects in the pre-priority list and the local cache until the first average hit rate in a subsequent metric cycle is greater than or equal to the predetermined threshold.
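A rough sketch of the hit-rate check and proportional attenuation follows. The average hit rate is computed as in claim 7 (total hits for the tracked identifiers divided by the number of tracked identifiers). The attenuation factor of 0.5 and all function names are assumptions made for illustration; the text only says the values are attenuated proportionally.

```python
def average_hit_rate(hit_counts, tracked_ids):
    """Total hits for the tracked identifiers divided by their count."""
    if not tracked_ids:
        return 0.0
    total_hits = sum(hit_counts.get(oid, 0) for oid in tracked_ids)
    return total_hits / len(tracked_ids)


def should_attenuate(first_rate, second_rate, threshold):
    """True when the local cache underperforms while the second level does not."""
    return first_rate < threshold and second_rate > threshold


def attenuate(heat_by_id, factor=0.5):
    """Scale every heat metric value by the same factor (assumed value)."""
    return {oid: heat * factor for oid, heat in heat_by_id.items()}


local_rate = average_hit_rate({"a": 1, "b": 1}, ["a", "b"])
remote_rate = average_hit_rate({"c": 6, "d": 4}, ["c", "d"])
heats = {"a": 8, "b": 3}
if should_attenuate(local_rate, remote_rate, threshold=2.0):
    heats = attenuate(heats)
```

A caller would repeat this check once per metric cycle and stop attenuating as soon as the first average hit rate reaches the threshold again.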
In summary, the data processing method and apparatus for a distributed multi-level cache according to the embodiments of the present application construct an associated pre-priority list for the local cache corresponding to each application instance of a target application, and record in real time the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values accessed by clients in each application instance. A unique identifier set of the second number of data objects with the highest heat metric values is obtained from the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache; the second number of data objects are obtained according to that set and written, together with their heat metric values, into each local cache. A third number of data objects are obtained according to the unique identifier set that remains in the unique identifier set of the first number of data objects after the unique identifier set of the second number of data objects is excluded, and are written in a distributed manner to the cache nodes of the second-level distributed cache. This improves the performance of the distributed multi-level cache in serving client access requests for data objects of the target application.
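The split of a pre-priority list into a local-cache portion and a second-level portion can be sketched as below. The placement of an object on a particular cache node is not specified in the text; the CRC32-modulo rule and the function names are hypothetical choices used only to make the example concrete.

```python
import zlib


def partition_pre_priority(entries, second_number):
    """Split (object_id, heat) entries into local and second-level id lists.

    The hottest `second_number` identifiers go to the local cache; the
    remaining identifiers (the "third number") go to the second level.
    """
    ranked = sorted(entries, key=lambda entry: -entry[1])
    local_ids = [oid for oid, _ in ranked[:second_number]]
    distributed_ids = [oid for oid, _ in ranked[second_number:]]
    return local_ids, distributed_ids


def assign_cache_node(object_id, node_count):
    """Hypothetical placement rule: stable CRC32 hash modulo the node count."""
    return zlib.crc32(object_id.encode("utf-8")) % node_count


entries = [("a", 9), ("b", 5), ("c", 1), ("d", 7)]
local_ids, distributed_ids = partition_pre_priority(entries, second_number=2)
placement = {oid: assign_cache_node(oid, node_count=3) for oid in distributed_ids}
```

With a first number of 4 and a second number of 2, the third number is 2, and those two identifiers are spread over the second-level cache nodes.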
It should be noted that, as those skilled in the art can understand, the different embodiments described in the method embodiments of the present application, the explanation and the achieved technical effects thereof are also applicable to the device embodiments of the present application, and are not repeated herein.
Further, the embodiments of the present application also provide an electronic device, which may include one or more processors and a memory. The memory stores computer program instructions that the one or more processors can invoke to perform all or part of the steps of the method described in any of the embodiments of the present application. The computer program instructions in the memory may be embodied in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
Further, the present application also provides a computer program product comprising a non-transitory computer-readable storage medium storing a computer program that, when the computer-readable storage medium is connected to a computer device and the computer program is executed by one or more processors of the computer device, performs all or part of the steps of the method of any of the embodiments of the present application.
Further, the present application also provides a non-transitory computer readable storage medium having stored thereon a computer program executable by one or more processors to perform all or part of the steps of the methods described in any of the embodiments of the present application.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments of the present application may be implemented by software, by a combination of software and a necessary general-purpose hardware platform, or, of course, by hardware. Based on such understanding, the technical solutions of the present application, in essence or in the part contributing over the prior art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device, including for example but not limited to a personal computer, a server, or a network device, to perform all or part of the steps of the method of any of the embodiments of the present application. The aforementioned storage medium may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing computer program code.
Exemplary embodiments of the present application have been described above. It should be understood that these exemplary embodiments are illustrative rather than limiting, and that the scope of protection of the present application is not limited to them. Those skilled in the art may make modifications and variations to the embodiments of the present application without departing from its spirit and scope, and such modifications and variations are intended to fall within the scope of the present application.

Claims (8)

1. A data processing method for a distributed multi-level cache, comprising:
respectively constructing a corresponding local cache for each application instance in a plurality of application instances of a target application, and obtaining a plurality of local caches respectively corresponding to the plurality of application instances;
respectively constructing a pre-priority list for each local cache in the plurality of local caches, and obtaining a plurality of pre-priority lists respectively associated with the plurality of local caches, wherein the pre-priority list is used for recording unique identifiers and heat metric values of a first number of data objects with highest heat metric values accessed by clients in each application instance in real time, and sorting according to the heat metric values of the data objects, and the heat metric values comprise differences between reading times and updating times of each data object monitored in real time by each application instance;
acquiring a unique identifier set of a second number of data objects with the highest heat metric values from the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache, acquiring the second number of data objects according to the unique identifier set of the second number of data objects, writing the second number of data objects and their heat metric values into each local cache, acquiring a third number of data objects according to the unique identifier set of the third number of data objects, which is the unique identifier set of the first number of data objects excluding the unique identifier set of the second number of data objects, and writing the third number of data objects in a distributed manner to the cache nodes of a second-level distributed cache;
each application instance, in response to an access request of a client for a target data object, calculating the heat metric value of the target data object in real time, and updating the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object;
judging whether the target data object hits in the local cache corresponding to each application instance, and if not, retrieving the target data object from the second-level distributed cache or a back-end database node;
judging whether the heat metric value of the target data object is greater than the minimum heat metric value of the data objects in the local cache corresponding to each application instance; and if so, replacing the data object having the minimum heat metric value in the local cache corresponding to each application instance, together with its heat metric value, with the target data object and its heat metric value.
2. The method according to claim 1, wherein updating the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object comprises:
determining the position of the unique identifier of the target data object in the pre-priority list according to the comparison of the heat metric value of the target data object and the heat metric value of the data object in the pre-priority list associated with the local cache corresponding to each application instance;
updating the pre-priority list according to the position of the unique identifier of the target data object in the pre-priority list, so that the pre-priority list keeps the unique identifiers of the first number of data objects with the highest recorded heat metric values and the heat metric values.
3. The method of data processing for a distributed multi-level cache of claim 2, further comprising:
if the target data object hits in the local cache corresponding to each application instance, updating the position and the heat metric value of the target data object in the local cache corresponding to each application instance according to the heat metric value of the target data object.
4. The method of data processing for a distributed multi-level cache of claim 1, further comprising:
merging and sorting the plurality of pre-priority lists to obtain a global pre-priority list, wherein the global pre-priority list records the unique identifiers and global heat metric values of the first number of data objects with the highest global heat metric values accessed by clients in the plurality of application instances;
comparing the global pre-priority list with the pre-priority list associated with each local cache, and obtaining a difference set between a unique identifier set of a second number of data objects with highest global heat metric values in the global pre-priority list and a unique identifier set of a second number of data objects with highest heat metric values in the pre-priority list associated with each local cache;
updating the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by a unique identifier in the difference set;
judging whether the global heat metric value of said data object is greater than the minimum heat metric value of the data objects in each local cache; and if so, retrieving said data object from the second-level distributed cache or a back-end database node, and replacing the data object having the minimum heat metric value in each local cache, together with its heat metric value, with said data object and its global heat metric value.
5. The method of claim 4, wherein updating the pre-priority list associated with each local cache according to the global heat metric value of any data object identified by a unique identifier in the difference set comprises:
determining the position of the unique identifier of said data object in the pre-priority list associated with each local cache by comparing the global heat metric value of said data object with the heat metric values of the data objects in the pre-priority list associated with each local cache;
updating the pre-priority list associated with each local cache according to the position of the unique identifier of said data object in that list, so that the pre-priority list associated with each local cache continues to record the unique identifiers and heat metric values of the first number of data objects with the highest heat metric values.
6. The method of data processing for a distributed multi-level cache of claim 1, further comprising:
calculating, in each metric cycle, a first average hit rate in the local cache of the unique identifier set of the second number of data objects in the pre-priority list associated with each local cache, and a second average hit rate in the second-level distributed cache of the unique identifier set of the third number of data objects;
judging whether the first average hit rate is smaller than a predetermined threshold while the second average hit rate is greater than the predetermined threshold, and if so, proportionally attenuating the heat metric values of all data objects in the pre-priority list and the local cache until the first average hit rate in a subsequent metric cycle is greater than or equal to the predetermined threshold.
7. The method of claim 6, wherein the first average hit rate is the ratio of the total number of hits in the local cache for the unique identifier set of the second number of data objects to the second number, and wherein the second average hit rate is the ratio of the total number of hits in the second-level distributed cache for the unique identifier set of the third number of data objects to the third number.
8. A data processing apparatus for distributed multi-level caching, comprising:
the cache construction unit is used for respectively constructing corresponding local caches for each application instance in a plurality of application instances of the target application to obtain a plurality of local caches respectively corresponding to the plurality of application instances;
a data monitoring unit, configured to construct a pre-priority list for each local cache in the plurality of local caches, obtaining a plurality of pre-priority lists respectively associated with the plurality of local caches, wherein the pre-priority list is used for recording in real time the unique identifiers and heat metric values of a first number of data objects with the highest heat metric values accessed by clients in each application instance, sorted according to the heat metric values of the data objects, and the heat metric value comprises the difference between the number of reads and the number of updates of each data object monitored in real time by each application instance;
a cache processing unit, configured to obtain a unique identifier set of a second number of data objects with the highest heat metric values from the unique identifier set of the first number of data objects in the pre-priority list associated with each local cache, obtain the second number of data objects according to the unique identifier set of the second number of data objects, write the second number of data objects and their heat metric values into each local cache, obtain a third number of data objects according to the unique identifier set of the third number of data objects, which is the unique identifier set of the first number of data objects excluding the unique identifier set of the second number of data objects, and write the third number of data objects in a distributed manner to the cache nodes of a second-level distributed cache;
a first list updating unit, configured to, in response to an access request of a client for a target data object, calculate the heat metric value of the target data object in real time, and update the pre-priority list associated with the local cache corresponding to each application instance according to the heat metric value of the target data object;
a hit result judging unit, configured to judge whether the target data object hits in the local cache corresponding to each application instance, and if not, retrieve the target data object from the second-level distributed cache or a back-end database node;
a first data replacement unit, configured to determine whether a heat metric value of the target data object is greater than a minimum heat metric value of a data object in a local cache corresponding to each application instance; if so, replacing the data object with the minimum heat metric value and the heat metric value thereof in the local cache corresponding to each application instance with the target data object and the heat metric value thereof.
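For illustration, the request-handling steps recited in claims 1 and 8 (heat metric value as reads minus updates, lookup on a local miss, and replacement of the coldest local entry) might look roughly like the sketch below. The class name, the `backend` callable standing in for the second-level distributed cache or back-end database node, and the rule of filling an empty slot without comparison are all assumptions; maintenance of the pre-priority list is omitted for brevity.

```python
class InstanceCache:
    """Hypothetical per-instance local cache with heat-based replacement."""

    def __init__(self, capacity, backend):
        self.capacity = capacity  # plays the role of the "second number"
        self.backend = backend    # stand-in for second-level cache / database
        self.reads = {}
        self.updates = {}
        self.local = {}           # object_id -> (value, heat)

    def heat(self, object_id):
        # Heat metric value: number of reads minus number of updates.
        return self.reads.get(object_id, 0) - self.updates.get(object_id, 0)

    def record_update(self, object_id):
        self.updates[object_id] = self.updates.get(object_id, 0) + 1

    def get(self, object_id):
        self.reads[object_id] = self.reads.get(object_id, 0) + 1
        heat = self.heat(object_id)
        if object_id in self.local:
            # Local hit: refresh the stored heat metric value.
            value, _ = self.local[object_id]
            self.local[object_id] = (value, heat)
            return value
        # Local miss: fetch from the next level.
        value = self.backend(object_id)
        if len(self.local) < self.capacity:
            # Assumption: a free slot is filled without any comparison.
            self.local[object_id] = (value, heat)
            return value
        coldest = min(self.local, key=lambda oid: self.local[oid][1])
        if heat > self.local[coldest][1]:
            # Replace the coldest entry only when the target is strictly hotter.
            del self.local[coldest]
            self.local[object_id] = (value, heat)
        return value


cache = InstanceCache(capacity=1, backend=lambda oid: oid.upper())
first = cache.get("a")   # miss, fills the only slot
second = cache.get("b")  # miss, heat 1 is not greater than 1, no replacement
kept_after_tie = set(cache.local)
third = cache.get("b")   # miss, heat 2 is greater than 1, replaces "a"
```

Because the comparison is strict, an object that is merely as hot as the coldest cached entry does not displace it, which avoids churn between equally popular objects.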

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310086503.4A CN115878677B (en) 2023-02-09 2023-02-09 Data processing method and device for distributed multi-level cache

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310086503.4A CN115878677B (en) 2023-02-09 2023-02-09 Data processing method and device for distributed multi-level cache

Publications (2)

Publication Number Publication Date
CN115878677A CN115878677A (en) 2023-03-31
CN115878677B true CN115878677B (en) 2023-05-12

Family

ID=85760859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310086503.4A Active CN115878677B (en) 2023-02-09 2023-02-09 Data processing method and device for distributed multi-level cache

Country Status (1)

Country Link
CN (1) CN115878677B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106527988B (en) * 2016-11-04 2019-07-26 郑州云海信息技术有限公司 A kind of method and device of solid state hard disk Data Migration
CN112506973B (en) * 2020-12-14 2023-12-15 中国银联股份有限公司 Method and device for managing storage data
CN113761321A (en) * 2021-08-06 2021-12-07 广州华多网络科技有限公司 Data access control method, data cache control method, data access control device, data cache control device, and medium
CN115168244A (en) * 2022-07-29 2022-10-11 苏州浪潮智能科技有限公司 Data updating method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN115878677A (en) 2023-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant