CN116775700A - Data caching method, device and storage medium - Google Patents

Data caching method, device and storage medium

Info

Publication number
CN116775700A
CN116775700A CN202211429929.7A CN202211429929A
Authority
CN
China
Prior art keywords
data
cached
node
cache
nodes
Prior art date
Legal status
Pending
Application number
CN202211429929.7A
Other languages
Chinese (zh)
Inventor
张彪
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Suzhou Software Technology Co Ltd
Priority to CN202211429929.7A
Publication of CN116775700A


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data caching method, a data caching device and a storage medium. When a first node obtains data to be cached based on a data acquisition request, the first node determines a caching result based on that data. When the caching result is to cache the data in the node's second-level cache, the first node also caches the identification information of the data in the shared third-level cache. Each node among the plurality of nodes that receives the same data acquisition request subscribes to that identification information. When a second node clears the data to be cached, every node that subscribes to the identification information clears the same data from its own second-level cache. Through this technical scheme, the consistency of the data cache is improved.

Description

Data caching method, device and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data caching method, device and storage medium.
Background
Distributed caches are widely used in network application systems. Under heavy traffic in particular, a distributed cache supports highly concurrent data access well, and each node in the distributed service cluster keeps its own local cache. Because these local caches are scattered across the nodes, timing differences between nodes, or failures such as an update that does not succeed on every node, can leave the local cache data of different nodes inconsistent.
Disclosure of Invention
To solve this technical problem, embodiments of the present invention provide a data caching method, apparatus and storage medium for a distributed service cluster. Nodes whose second-level caches hold the same data subscribe to the same identification information in a shared third-level cache, so that when any node clears that cached data, every other node that subscribes to the identification information can promptly clear the same data from its own second-level cache, thereby improving the consistency of the data cache.
The technical scheme of the invention is realized as follows:
the invention provides a data caching method applied to a distributed service cluster, wherein the cluster comprises a plurality of nodes, each node has its own first-level cache and second-level cache, and all nodes share the same third-level cache and a database. The method comprises the following steps:
when a first node obtains data to be cached based on a data acquisition request, determining, by the first node, a caching result based on the data to be cached; the first node is any one of the plurality of nodes, and the data to be cached is obtained from its first-level cache, its second-level cache, the third-level cache or the database;
when the caching result is to cache the data to be cached in the corresponding second-level cache, caching, by the first node, the identification information of the data to be cached in the third-level cache;
subscribing to the identification information by each of the plurality of nodes that receives the data acquisition request;
when a second node clears the data to be cached, clearing the data to be cached from the corresponding second-level cache of each node that subscribes to the identification information; the second node is any node among the plurality of nodes that subscribes to the identification information.
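The subscribe-and-invalidate flow described in these steps can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation (which targets a Java stack with Redis as the shared third-level cache); all class and method names are assumptions.

```python
class SharedTierThree:
    """Stands in for the shared third-level cache (e.g. Redis) with pub/sub."""
    def __init__(self):
        self.subscribers = {}  # identification info (key) -> subscribed nodes

    def subscribe(self, key, node):
        self.subscribers.setdefault(key, set()).add(node)

    def publish_invalidate(self, key):
        # every subscriber clears the same data from its second-level cache
        for node in list(self.subscribers.get(key, ())):
            node.clear_local(key)

class Node:
    def __init__(self, tier3):
        self.l2 = {}       # this node's second-level cache
        self.tier3 = tier3

    def cache(self, key, value):
        self.l2[key] = value
        self.tier3.subscribe(key, self)  # subscribe to the key's identification info

    def clear_local(self, key):
        self.l2.pop(key, None)

    def evict(self, key):
        # clearing on any one node notifies every node subscribed to the key
        self.tier3.publish_invalidate(key)

tier3 = SharedTierThree()
a, b = Node(tier3), Node(tier3)
a.cache("user:1", {"id": 1})
b.cache("user:1", {"id": 1})
a.evict("user:1")
print("user:1" in a.l2, "user:1" in b.l2)  # False False
```

Both nodes end up with the key cleared, which is the consistency property the scheme aims at.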
In the above method, obtaining, by the first node, the data to be cached based on the data acquisition request includes:
when the first node receives the data acquisition request, searching, by the first node, its first-level cache for cached data matching the request;
if the first node finds no matching cached data in its first-level cache, searching its second-level cache for cached data matching the request;
if the first node finds no matching cached data in its second-level cache, searching the third-level cache for cached data matching the request;
if the first node finds no matching cached data in the third-level cache, searching the database for data matching the request and determining the found data as the data to be cached.
In the above method, searching the database for data matching the data acquisition request and determining the found data as the data to be cached includes:
generating a target data acquisition task based on the data acquisition request, and splitting the target data acquisition task into a plurality of data acquisition subtasks;
determining, from the plurality of nodes, a target node for each of the subtasks, obtaining a plurality of target nodes in one-to-one correspondence with the subtasks;
executing the subtasks on their respective target nodes to obtain a plurality of task results in one-to-one correspondence with the subtasks;
aggregating the task results and determining the aggregated data as the data to be cached.
In the above method, determining a caching result based on the data to be cached includes:
acquiring data characteristic information of the data to be cached, the characteristic information comprising at least one of the data size of the data to be cached and the number of requests for it;
when the data importance represented by the characteristic information is greater than or equal to a first preset threshold, determining the caching result as caching the data to be cached in the second-level cache of the first node;
when the data importance represented by the characteristic information is less than the first preset threshold, determining the caching result as caching the data to be cached in the third-level cache.
In the above method, after determining that the caching result is to cache the data to be cached in the second-level cache of the first node, the method further includes:
when the data importance represented by the characteristic information is greater than a second preset threshold, writing the data to be cached into the local file system of the first node;
when the first node restarts, reading the data to be cached back from the local file system;
when the first node clears the data to be cached from its second-level cache, also clearing it from the local file system.
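The persistence steps above can be sketched as follows. This is an illustrative sketch only (the threshold value, file format, and all names are assumptions, not from the patent): very important entries are also written to the local file system, reloaded after a restart, and deleted when the cache entry is cleared.

```python
import json
import os
import tempfile

class PersistentL2:
    """Second-level cache whose hottest entries survive a node restart."""
    def __init__(self, directory):
        self.directory = directory
        self.l2 = {}

    def _path(self, key):
        return os.path.join(self.directory, key + ".json")

    def cache(self, key, value, importance, persist_threshold=0.8):
        self.l2[key] = value
        if importance > persist_threshold:      # "second preset degree threshold"
            with open(self._path(key), "w") as f:
                json.dump(value, f)             # write-through to the local file

    def restart(self):
        self.l2 = {}                            # simulate losing in-memory state
        for name in os.listdir(self.directory): # reload persisted entries
            with open(os.path.join(self.directory, name)) as f:
                self.l2[name[:-len(".json")]] = json.load(f)

    def evict(self, key):
        self.l2.pop(key, None)
        try:
            os.remove(self._path(key))          # clear from the local file too
        except FileNotFoundError:
            pass

d = tempfile.mkdtemp()
c = PersistentL2(d)
c.cache("hot", {"v": 1}, importance=0.9)
c.restart()
print(c.l2)  # {'hot': {'v': 1}}
```

Evicting the entry removes both the in-memory copy and the file, so a later restart does not resurrect stale data.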
In the above method, after determining the caching result based on the data to be cached, the method further includes:
when the data to be cached is cached according to the caching result, setting the data expiration time of the data to be cached based on its data characteristic information.
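One way to derive an expiration time from the two characteristics the text names (data size and request count) is sketched below. The weighting, bounds, and function name are assumptions; the patent only states that the expiration time is set from the characteristic information.

```python
def expiration_seconds(data_size_bytes, request_count,
                       base_ttl=300, max_ttl=3600):
    """Keep smaller, more frequently requested data cached for longer."""
    # each request earns extra lifetime, capped so we never exceed max_ttl
    frequency_bonus = min(request_count * 10, max_ttl - base_ttl)
    # each KiB of size costs lifetime, capped so we never drop below 60s here
    size_penalty = min(data_size_bytes // 1024, base_ttl - 60)
    return max(60, base_ttl + frequency_bonus - size_penalty)

print(expiration_seconds(data_size_bytes=0, request_count=0))       # 300
print(expiration_seconds(data_size_bytes=0, request_count=1000))    # 3600
print(expiration_seconds(data_size_bytes=10**9, request_count=0))   # 60
```

Bounding the TTL also serves the background goal stated later for Redis: a finite expiry limits how far a fault can spread when caches become inconsistent.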
In the above method, after each node that receives the data acquisition request subscribes to the identification information, the method further includes:
when the subscription service of the third-level cache is abnormal and the second node clears the data to be cached, notifying the nodes other than the second node through the message bus of the plurality of nodes;
clearing the data to be cached from the corresponding second-level cache of each node other than the second node.
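The fallback path above can be sketched as follows: when the third-level cache's subscription service is down, the clearing node broadcasts the invalidation over the cluster message bus instead. All names are illustrative assumptions.

```python
class MessageBus:
    """Stands in for the cluster-wide message bus shared by the nodes."""
    def __init__(self, nodes):
        self.nodes = nodes

    def broadcast(self, sender, key):
        for node in self.nodes:
            if node is not sender:      # notify every node other than the sender
                node.clear_local(key)

class BusNode:
    def __init__(self):
        self.l2 = {}                    # this node's second-level cache
        self.bus = None

    def clear_local(self, key):
        self.l2.pop(key, None)

    def evict(self, key, subscription_ok):
        self.clear_local(key)
        if not subscription_ok:         # pub/sub abnormal -> fall back to the bus
            self.bus.broadcast(self, key)

nodes = [BusNode() for _ in range(3)]
bus = MessageBus(nodes)
for n in nodes:
    n.bus = bus
    n.l2["k"] = "v"
nodes[0].evict("k", subscription_ok=False)
print([n.l2 for n in nodes])  # [{}, {}, {}]
```

When the subscription service is healthy, invalidation flows through the third-level cache's pub/sub as in the earlier steps; the bus is only the degraded-mode channel.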
The invention provides a data caching device applied to a distributed service cluster, wherein the cluster comprises a plurality of nodes, each node has its own first-level cache and second-level cache, and all nodes share the same third-level cache and a database. The data caching device comprises:
a determining module, configured to determine, by a first node, a caching result based on data to be cached when the first node obtains the data based on a data acquisition request; the first node is any one of the plurality of nodes, and the data to be cached is obtained from its first-level cache, its second-level cache, the third-level cache or the database;
a caching module, configured to cache, by the first node, the identification information of the data to be cached in the third-level cache when the caching result is to cache the data in the corresponding second-level cache;
a subscription module, configured to subscribe to the identification information by each of the plurality of nodes that receives the data acquisition request;
a clearing module, configured to clear the data to be cached from the corresponding second-level cache of each node that subscribes to the identification information when a second node clears the data; the second node is any node among the plurality of nodes that subscribes to the identification information.
The invention provides a data caching device, comprising: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is configured to execute the computer program stored in the memory, so as to implement the data caching method.
The present invention provides a computer-readable storage medium storing one or more computer programs, which are executable by one or more processors to implement the data caching method described above.
The invention provides a data caching method, device and storage medium. The method includes: when a first node obtains data to be cached based on a data acquisition request, determining, by the first node, a caching result based on the data; when the caching result is to cache the data in the corresponding second-level cache, caching, by the first node, the identification information of the data in the third-level cache; subscribing to the identification information by each node that receives the data acquisition request; and, when a second node clears the data to be cached, clearing the data from the corresponding second-level cache of each node that subscribes to the identification information. According to this technical scheme, nodes in the distributed service cluster whose second-level caches hold the same cached data subscribe to the same identification information in the third-level cache, so that when any node clears the cached data, every other subscriber can promptly clear the same data from its own second-level cache, improving the consistency of the data cache.
Drawings
Fig. 1 is a schematic flow chart of a data caching method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an exemplary node data cache according to an embodiment of the present invention;
FIG. 3 is a flow chart of an exemplary data acquisition according to an embodiment of the present invention;
FIG. 4 is a flow chart of an exemplary task process provided by an embodiment of the present invention;
FIG. 5 is a flow chart illustrating exemplary local cache persistence provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of exemplary subscription identification information provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of an exemplary process for clearing cache data according to an embodiment of the present invention;
FIG. 8 is a second exemplary flow chart for clearing cache data according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a data caching device according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of a data caching device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely below with reference to the accompanying drawings of the embodiments of the present invention. It is to be understood that the specific embodiments described herein merely illustrate the application and do not limit it. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
The invention provides a data caching method applied to a distributed service cluster. The cluster comprises a plurality of nodes; each node has its own first-level cache and second-level cache, and all nodes share the same third-level cache and a database. The data caching method is implemented by a data caching device. Fig. 1 is a schematic flow chart of the data caching method provided by the embodiment of the invention. As shown in fig. 1, the method mainly comprises the following steps:
s101, under the condition that data to be cached is acquired by a first node based on a data acquisition request, determining a caching result based on the data to be cached by the first node; the first node is any one of a plurality of nodes, and data to be cached is obtained from a corresponding primary cache, secondary cache, tertiary cache or database.
In the embodiment of the invention, when the data caching device acquires the data to be cached based on the data acquisition request by using the first node, determining a caching result based on the data to be cached by using the first node; the first node is any one of a plurality of nodes, and data to be cached is obtained from a corresponding primary cache, secondary cache, tertiary cache or database.
It should be noted that, in the embodiment of the present invention, the distributed service cluster includes a plurality of nodes, each with its own first-level and second-level cache. In the prior art, both levels can be implemented with MyBatis, an object-relational mapping (Object Relational Mapping, ORM) framework for operating databases. In MyBatis, the first-level cache is the local cache (localCache) of the base executor (BaseExecutor), of type PerpetualCache; it is essentially a simple lookup map, fairly primitive, with low configurability and no memory protection, but it is session-scoped, and since sessions are closed promptly it cannot stay saturated for long, so it is not extended here. The second-level cache is cross-session, so it must do better on cache hit rate and, as a local cache, must also perform better in general; the invention therefore adds the local third-party cache library Caffeine on top of the MyBatis framework. The original MyBatis second-level cache can still optionally be used; because the original first-level and second-level caches differ only in data scope, both are treated here as the first-level cache, and the added Caffeine library serves as the second-level cache.
Here, Caffeine is a high-performance cache for Java 1.8 whose in-memory caching application program interface (Application Program Interface, API) is modeled on Google Guava. Caffeine is an experience-driven improvement on the Google Guava Cache design, with the following characteristics: data can be loaded into the cache automatically, optionally asynchronously; when the cache reaches its maximum capacity, it evicts entries based on size, using frequency and recency of access; entries can expire after a fixed time measured since the last access or last write; entries are refreshed asynchronously when the first stale request for them occurs; keys can be wrapped in weak references, and values in weak or soft references; notifications are delivered when entries are evicted or otherwise deleted; written data can be propagated to external resources; and the cache records how many times each entry is requested.
It should be noted that, in the embodiment of the present invention, the plurality of nodes also share the same third-level cache and database. The third-level cache may be, for example, Redis (Remote Dictionary Server). Redis is a non-relational (NoSQL) open-source key-value store written in ANSI C, supporting multiple data structures and networked access, operating in memory with optional persistence. Because Redis uses memory as its storage medium, its read and write throughput is extremely high, far exceeding a database: taking 256-byte strings as an example, it can reach about 110,000 reads per second and 81,000 writes per second. Redis differs from a distributed cache system such as memcache in that its data can be persisted and is not lost after a power failure or restart. Redis storage comprises three parts, memory, disk and log files, so after a restart it can reload data from disk into memory; this is controlled through its configuration file, and among such caches only Redis achieves persistence in this way. Redis also supports a time-based invalidation strategy by attaching an expiry time to each key; a cache expiry time helps limit how far a fault spreads when caches become inconsistent.
Fig. 2 is a schematic structural diagram of an exemplary node data cache according to an embodiment of the present invention. As shown in fig. 2, the distributed service cluster includes a plurality of nodes, each node having its own first-level cache and second-level cache and connecting to the shared third-level cache and database. The first-level cache comprises a local cache and a global cache: the local cache contains the caching executor (CachingExecutor), the executor (Executor) and the local cache (localCache), while the global cache holds the mappers. The second-level cache may be a local third-party cache library such as Caffeine or EhCache; the third-level cache may be the distributed cache Redis.
It should be noted that, in the embodiment of the present invention, the data caching device may receive a data acquisition request at any of the plurality of nodes. When the first node receives a data acquisition request, it searches for the data in order: its first-level cache, its second-level cache, the third-level cache, and finally the database. If the data to be cached is obtained from the database, the data was not cached at any of the three levels; after the database query returns the data to be cached, the data caching device determines the caching result based on the data by the first node.
It should be noted that, in the embodiment of the present invention, the first-level, second-level and third-level caches can each be understood as a store of key-value pairs, where the value is the data to be cached; that is, the corresponding cached data is obtained based on the identification information (the key) carried by the data acquisition request.
Specifically, in the embodiment of the present invention, the data caching device obtains the data to be cached by the first node based on the data acquisition request as follows: when the first node receives the data acquisition request, it searches its first-level cache for cached data matching the request; if none is found there, it searches its second-level cache; if none is found there, it searches the third-level cache; and if none is found there either, it searches the database for matching data and determines the found data as the data to be cached.
It should be noted that, in the embodiment of the present invention, when the data caching device receives the data acquisition request at the first node, the first node first looks in its first-level cache: if the identification information (key) carried by the request locates a cached value there, that value is returned directly, i.e., a hit. On a miss in the first-level cache, the node looks in its second-level cache and, if the key locates a value there, returns that value directly. On a miss in the second-level cache, it looks in the third-level cache Redis and, on a hit, returns the value as the data to be cached. If the third-level cache also misses, the data is looked up in the database based on the data acquisition request, and the found data is determined as the data to be cached.
Fig. 3 is a schematic diagram of an exemplary data acquisition process according to an embodiment of the present invention. As shown in fig. 3, the data caching device obtains, by the first node, the identification information of the data to be cached and generates a data acquisition request from it. It then searches the first node's first-level cache for the value matching the request's key; on a hit, the hit value is taken as the data to be cached. On a miss, it continues with the second-level cache; on a hit there, the hit value is taken as the data to be cached and the request time is updated. On a miss, it continues with the third-level cache Redis; on a hit there, the hit value is taken as the data to be cached. If Redis also misses, a Structured Query Language (SQL) query is executed against the database, and the matching value is stored, as the data to be cached, in the second-level cache or the third-level cache Redis.
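The lookup order just described (first-level, then second-level, then third-level, then database) can be sketched as follows. This is a minimal sketch under stated assumptions: plain dicts stand in for the MyBatis, Caffeine and Redis tiers, and `query_db` for the real SQL query; the choice of tier for the write-back is simplified.

```python
def lookup(key, l1, l2, l3, query_db):
    """Return the value for key, checking each cache tier before the database."""
    for tier in (l1, l2, l3):
        if key in tier:
            return tier[key]   # hit: return the cached value directly
    value = query_db(key)      # miss at every level: run the real query
    l3[key] = value            # cache the result (tier selection simplified here)
    return value

l1, l2, l3 = {}, {}, {}
print(lookup("user:1", l1, l2, l3, lambda k: "row-" + k))  # row-user:1
print(lookup("user:1", l1, l2, l3, lambda k: "row-" + k))  # row-user:1 (from l3)
```

The second call is served from the third-level cache without touching the database, which is the point of the tiered search.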
Specifically, in the embodiment of the present invention, the data caching device searches the database for data matching the data acquisition request and determines the found data as the data to be cached as follows: generating a target data acquisition task based on the data acquisition request, and splitting it into a plurality of data acquisition subtasks; determining, from the plurality of nodes, a target node for each subtask, obtaining a plurality of target nodes in one-to-one correspondence with the subtasks; executing the subtasks on their respective target nodes to obtain a plurality of task results in one-to-one correspondence with the subtasks; and aggregating the task results and determining the aggregated data as the data to be cached.
It should be noted that, in the embodiment of the present invention, searching the database for data matching the data acquisition request may be a process of executing an SQL query. In conventional programming, concurrency is applied to code in the service layer to improve execution efficiency, while processing in the data access object (Data Access Object, DAO) layer is left to the ORM framework, whose internals are a black box. The traditional MyBatis framework executes on a single thread, mainly because its internal steps have strong sequential dependencies, so it is either unsuited to concurrency or gains little from it. Careful analysis of MyBatis's internal execution flow shows that the bottleneck of executing SQL operations is the interaction with the database server. The data caching device therefore processes data acquisition requests using a sharding mechanism.
It should be noted that, in the embodiment of the present invention, for a data acquisition request that misses the third-level cache, the data caching device performs a real query operation, i.e., searches the database for data matching the request. It generates a target data acquisition task from the request and splits it into a plurality of data acquisition subtasks according to the service configuration file; each subtask is one shard, and at the business level the shards must be completely independent, for example two SQL queries for id 1 and id 2. A target node is then determined for each subtask, giving a plurality of target nodes in one-to-one correspondence with the subtasks; the target nodes execute their subtasks to produce a plurality of task results in one-to-one correspondence with the subtasks; finally the task results are aggregated, and the aggregated data is determined as the data to be cached.
Fig. 4 is a schematic flow chart of exemplary task execution according to an embodiment of the present invention. As shown in fig. 4, task execution is walked through with 3 nodes as an example. To query data in the range id in (1, 3, 8), the node that receives the data acquisition request (assume node A) automatically becomes the master node and takes the role of a registry, while the other nodes in the distributed service cluster become slaves. The master node splits the query id in (1, 3, 8) into 3 shards, one each for id=1, id=3 and id=8. The 3 shards fall onto the three nodes (node A, node B and node C) and are executed in parallel; after the three nodes obtain their results, the results are returned to the master node (node A) for data encapsulation (aggregation), and whether the data is stored in the corresponding second-level or third-level cache is determined by its size.
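The split-execute-aggregate walk-through above can be sketched as follows, with one shard per id executed in parallel and the results aggregated on the receiving (master) node. The function names and the round-robin shard placement are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def query_shard(node_name, record_id):
    # stands in for one node running "select ... where id = ?" on the database
    return {"id": record_id, "served_by": node_name}

def sharded_query(ids, node_names):
    """Split an 'id in (...)' query into one shard per id and aggregate."""
    with ThreadPoolExecutor(max_workers=len(node_names)) as pool:
        futures = [pool.submit(query_shard, node_names[i % len(node_names)], rid)
                   for i, rid in enumerate(ids)]          # one shard per id
        results = [f.result() for f in futures]           # each shard's result
    return sorted(results, key=lambda r: r["id"])         # aggregation on master

rows = sharded_query([1, 3, 8], ["A", "B", "C"])
print([r["id"] for r in rows])  # [1, 3, 8]
```

As in the text, the shards are independent at the business level, so they can run concurrently without the sequential dependencies that keep MyBatis itself single-threaded.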
Specifically, in the embodiment of the present invention, the data caching device determines the caching result based on the data to be cached as follows: acquiring data characteristic information of the data to be cached, the characteristic information comprising at least one of the data size of the data to be cached and the number of requests for it; when the data importance represented by the characteristic information is greater than or equal to a first preset threshold, determining the caching result as caching the data to be cached in the second-level cache of the first node; and when the importance is less than the first preset threshold, determining the caching result as caching the data to be cached in the third-level cache.
It should be noted that, in the embodiment of the present invention, after the data caching device obtains the data to be cached from the database, it may obtain the data characteristic information of the data to be cached. Because the first-level cache and the second-level cache both belong to the local cache, memory space is limited; the first-level cache is cleared when the session is closed, so it does not occupy memory for a long time. The second-level cache, however, spans sessions, and when the data volume is large the pressure on local memory grows. A first preset degree threshold can therefore be set against the data importance degree represented by the data characteristic information: if the importance degree exceeds the first preset degree threshold, the data enters the second-level cache; if it does not reach the first preset degree threshold, the data enters the third-level cache instead. The specific threshold can be set according to actual requirements and application scenarios, and the first preset degree threshold can be dynamically adjusted according to the service characteristics of the data acquisition request, enabling hot updating.
Exemplarily, the data caching device can specify that the smaller the data volume, the greater the importance degree represented for the data to be cached, and the larger the data volume, the smaller the importance degree; likewise, the more data requests, the greater the importance degree represented for the data to be cached, and the fewer data requests, the smaller the importance degree. In this way, data with a greater importance degree is cached in the second-level cache of the local cache, so that when the same data acquisition request is received again, the data to be cached can be obtained directly from the local second-level cache, avoiding network input/output (IO) delay; while data with a large data volume is cached in the remote third-level cache, reducing the space pressure on the second-level cache. This adaptive caching mode, driven by the data characteristic information of the data to be cached, better relieves local memory pressure and improves data acquisition efficiency.
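A minimal sketch of the tier decision follows. The scoring formula (`requestCount - sizeBytes/1024`) and the threshold value are invented for illustration; the patent only requires that smaller, more frequently requested data score as more important, with the real "first preset degree threshold" configurable and hot-updatable.

```go
package main

import "fmt"

// chooseTier picks a cache tier from the data characteristic information:
// size (bytes) and number of requests. Smaller + hotter => more important.
func chooseTier(sizeBytes, requestCount int) string {
	importance := requestCount - sizeBytes/1024 // hypothetical scoring
	const firstThreshold = 0                    // stand-in for the configurable threshold
	if importance >= firstThreshold {
		return "L2" // local second-level cache
	}
	return "L3" // remote third-level cache (e.g. Redis)
}

func main() {
	fmt.Println(chooseTier(512, 10))  // small, frequently requested data
	fmt.Println(chooseTier(1<<20, 1)) // large, rarely requested data
}
```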
Specifically, in the embodiment of the present invention, after the data caching device determines that the caching result is the second level cache corresponding to the first node, the following steps may be further executed: writing the data to be cached into a local file system of the first node under the condition that the importance degree of the data represented by the data characteristic information is larger than a second preset degree threshold; under the condition that the first node is restarted, reading data to be cached from a local file system; and under the condition that the first node clears the data to be cached in the corresponding secondary cache, clearing the data to be cached from the local file system.
It should be noted that, in the embodiment of the present invention, since the first-level cache and the second-level cache serve as memory cache blocks, their biggest advantage is speed; but because they reside in memory, the cached data is lost after a node restarts. In that case every data acquisition request would have to be served from the database, so the database pressure would surge, the data interface would respond slowly, and the whole database could even collapse. Therefore, the data processing apparatus considers whether the importance degree of the data represented by the data characteristic information of the data to be cached is greater than a second preset degree threshold, which can be set according to actual requirements and application scenarios. For example, whether the identification information (key) of the data to be cached needs to be persisted locally can be judged from the configuration in the node configuration file, and data that needs persistence can be given a higher importance level; alternatively, the data caching device can assign an importance value to the data to be cached according to its number of requests: the more requests, the greater the importance degree of the data to be cached, and the fewer requests, the smaller the importance degree. Thus, for important data, that is, data whose importance degree represented by the data characteristic information is greater than the second preset degree threshold, the data caching device uses the first node to write the data to be cached into the local file system of the first node. The local file system is the local disk, and writing the data to be cached into it may be implemented as follows: the caching task is submitted to a queue and processed asynchronously by a coroutine; when the node starts for the first time, a file with the same name as the node is created to hold the locally persisted data, and after the node restarts the data to be cached can be read back from the local file system.
FIG. 5 is a flow chart illustrating exemplary local cache persistence provided by an embodiment of the present invention. As shown in fig. 5, when the data processing apparatus caches data locally, it stores important data in the local file system according to the service configuration, performing the write through a coroutine. When the node is restarted, the data in the local file system is read for data initialization.
It should be noted that, in the embodiment of the present invention, when the data processing apparatus uses the first node to clear the data to be cached in the corresponding second-level cache, it also uses the first node to clear the data to be cached from the local file system, so as to avoid dirty data in the local file system. The data processing apparatus may use the first node to add the data to be cleared to a task queue, with a dedicated coroutine deleting the data from the local file system, ensuring that expired data in the persistent file is deleted in time. An exemplary implementation of this data clearing is shown in fig. 5: tasks are taken from the task queue and the corresponding data is deleted from the local file system, with a coroutine doing the processing. First, the coroutine does not affect the flow of the main thread, so the interface is not slowed down, and the backup and clearing operations are decoupled from the business logic and run entirely in the background. Second, the coroutine does not mask the state of the application: if a conventional thread were used instead, then after the main thread went down the application would not exit, because the conventional thread would still be running; the application would be faulty, yet the failure could not be discovered in time. With a coroutine, the coroutine fails together with the application, so the problem can be discovered promptly.
Specifically, in the embodiment of the present invention, after the data caching device determines the caching result based on the data to be cached, the following steps may be further performed: and setting the data expiration time of the data to be cached based on the data characteristic information of the data to be cached under the condition that the data to be cached is cached based on the caching result.
It should be noted that, in the embodiment of the present invention, the data caching device selects a time-based scheme among the eviction cache policies; an exemplary implementation is expireAfterAccess(long, TimeUnit), that is, timing starts after the last access or write, and the entry expires after the specified time.
It should be noted that, in the embodiment of the present invention, when the data to be cached is cached based on the caching result, the data caching device sets the data expiration time of the data to be cached based on the data characteristic information. For most scenarios this matches the expectation that the most frequently used data should be retained longest and the least used data should be retained shortest, that is, the expiration time is determined based on the number of data requests. When the expiration time is reached, the data processing device clears the data cached in the second-level cache; at that time, if the same data is also stored in the local file system, it is correspondingly deleted there as well.
And S102, under the condition that the caching result is that the data to be cached is cached to the corresponding second-level cache, the first node is utilized to cache the identification information of the data to be cached in the third-level cache.
In the embodiment of the invention, the data caching device utilizes the first node to cache the identification information of the data to be cached in the three-level cache under the condition that the caching result is that the data to be cached is cached in the corresponding second-level cache.
It should be noted that, in the embodiment of the present invention, when the caching result determined by the first node is to cache the data to be cached to the corresponding second-level cache, the first node caches the identification information (key) of the data to be cached in the third-level cache. Then, when several of the plurality of nodes receive the same data acquisition request, only one copy of the identification information needs to exist in the third-level cache: because the identification information of the data to be cached is the same, when a second node caches that identification information in the third-level cache it overwrites the identification information cached by the previous node, so only one piece of identification information is stored in redis.
S103, subscribing the identification information by utilizing the nodes which receive the data acquisition request from the plurality of nodes.
In the embodiment of the invention, the data caching device subscribes to the identification information respectively by utilizing the nodes which receive the data acquisition request from the plurality of nodes.
It should be noted that, in the embodiment of the present invention, after the data processing apparatus stores a key in the third-level cache as the identifier of the second-level cached data, each of the plurality of nodes that received the data acquisition request subscribes to the identification information key, i.e. listens on that key.
Fig. 6 is a schematic diagram of exemplary subscription identification information provided in an embodiment of the present invention. As shown in fig. 6, the data processing apparatus subscribes to the identification information in the three-level cache redis using each of the nodes that received the data acquisition request among the plurality of nodes.
Specifically, in the embodiment of the present invention, after subscribing the identification information respectively by using the nodes that receive the data acquisition request from the plurality of nodes, the data caching apparatus may further execute the following steps: when the communication service subscribing the identification information in the three-level cache is abnormal and the second node clears the data to be cached, notifying nodes which are different from the second node in the plurality of nodes by using message buses of the plurality of nodes; and respectively clearing the data to be cached in the corresponding secondary cache by utilizing nodes which are different from the second node in the plurality of nodes.
It should be noted that, in the embodiment of the present invention, in practice the communication service subscribing to the identification information in the third-level cache may fail, leaving the second-level caches corresponding to the plurality of nodes inconsistent. At this time the data caching apparatus starts a bus mechanism, that is, it uses the message bus of the plurality of nodes to notify the other nodes to clear the data to be cached in their corresponding second-level caches, ensuring that the data cached in each node's second-level cache remains consistent.
It should be noted that, in the embodiment of the present invention, when the data caching apparatus clears the second-level cache on one node, it may have that node publish a message to notify the other nodes, and after the other nodes receive the message from the message bus they clear their local second-level caches. Illustratively, the distributed nodes are connected by a lightweight message broker, which is used to broadcast state changes, such as configuration changes, or other message instructions; the core idea is to extend a Spring Boot application through a distributed starter, which can also be used to establish a communication channel among multiple applications.
Fig. 7 is a schematic diagram of an exemplary process for clearing cache data according to an embodiment of the present invention. As shown in fig. 7, the data processing apparatus triggers cache cleaning through an interface call (a curl trigger) on any node, for example node A (clientA); clientA then clears its local cache and sends a command to the message bus, and node B (clientB) and node C (clientC) receive the broadcast on the message bus and start clearing the local second-level caches on their respective nodes.
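The bus-based fallback in fig. 7 can be sketched as a simple fan-out. This is an in-memory stand-in for a real broker (such as the Spring Cloud Bus setup mentioned above), with all type and field names invented for illustration: the triggered node clears its own second-level cache and publishes a clear command, and every other node on the bus clears the same key locally.

```go
package main

import "fmt"

type node struct {
	name string
	l2   map[string]string // the node's local second-level cache
}

type bus struct{ nodes []*node }

// publish broadcasts a clear command to every node except the sender.
func (b *bus) publish(from *node, key string) {
	for _, n := range b.nodes {
		if n != from {
			delete(n.l2, key)
		}
	}
}

// clearCache clears the key locally, then notifies peers over the bus.
func clearCache(b *bus, n *node, key string) {
	delete(n.l2, key)
	b.publish(n, key)
}

func main() {
	a := &node{"clientA", map[string]string{"user:1": "x"}}
	bn := &node{"clientB", map[string]string{"user:1": "x"}}
	c := &node{"clientC", map[string]string{"user:1": "x"}}
	mb := &bus{nodes: []*node{a, bn, c}}
	clearCache(mb, a, "user:1") // curl-style trigger on node A
	fmt.Println(len(a.l2), len(bn.l2), len(c.l2))
}
```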
S104, under the condition that the second node is utilized to clear the data to be cached, utilizing the nodes subscribed with the identification information in the plurality of nodes to respectively clear the data to be cached in the corresponding second-level caches; the second node is any one of the plurality of nodes subscribed to the identification information.
In the embodiment of the invention, the data caching device respectively clears the data to be cached in the corresponding secondary cache by utilizing the nodes subscribed with the identification information in the plurality of nodes under the condition that the second node is utilized to clear the data to be cached; the second node is any one of the plurality of nodes subscribed to the identification information.
It should be noted that, in the embodiment of the present invention, after the nodes that received the data acquisition request have subscribed to the identification information key, if any node subscribed to the identification information clears the data to be cached, it simultaneously clears the identification information stored in the third-level cache. Because the other subscribed nodes are listening on the identification information, they receive a command to clear the data to be cached and accordingly clear the data to be cached in their corresponding second-level caches.
Fig. 8 is a second schematic flow chart of clearing cache data according to an embodiment of the present invention. As shown in fig. 8, node A (clientA), node B (clientB), and node C (clientC) subscribe to the identification information key in the third-level cache redis. If the data to be cached on clientA expires and the local cache is cleared, the third-level cache redis is notified to clear the identification information key, so that the other nodes subscribing to the identification information, clientB and clientC, receive a command to clear the data to be cached and clear it from their corresponding second-level caches respectively.
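The subscription flow of fig. 8 can be sketched with an in-memory stand-in for Redis pub/sub (all names hypothetical): nodes register callbacks on the shared identification key, and deleting the key notifies every subscriber, which then clears its local second-level cache.

```go
package main

import "fmt"

// l3cache simulates the shared third-level cache with key subscriptions.
type l3cache struct {
	keys        map[string]bool
	subscribers map[string][]func(key string)
}

func newL3() *l3cache {
	return &l3cache{
		keys:        map[string]bool{},
		subscribers: map[string][]func(string){},
	}
}

func (c *l3cache) set(key string) { c.keys[key] = true }

func (c *l3cache) subscribe(key string, f func(string)) {
	c.subscribers[key] = append(c.subscribers[key], f)
}

// del clears the key and notifies every listener, like a pub/sub channel.
func (c *l3cache) del(key string) {
	delete(c.keys, key)
	for _, f := range c.subscribers[key] {
		f(key)
	}
}

func main() {
	l3 := newL3()
	l3.set("user:1:v") // identification key written by the first node
	caches := map[string]map[string]string{
		"clientB": {"user:1:v": "x"},
		"clientC": {"user:1:v": "x"},
	}
	for name := range caches {
		local := caches[name]
		l3.subscribe("user:1:v", func(k string) { delete(local, k) })
	}
	l3.del("user:1:v") // clientA's entry expired; key cleared in "redis"
	fmt.Println(len(caches["clientB"]), len(caches["clientC"]))
}
```

In a real deployment this notification could ride on Redis keyspace notifications or an explicit publish on key deletion; the sketch only captures the ordering: clear the shared key, then every subscriber evicts its local copy.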
The invention provides a data caching method, which is applied to a distributed service cluster, wherein the distributed service cluster comprises a plurality of nodes, each node in the plurality of nodes respectively comprises a corresponding primary cache and a corresponding secondary cache, and the plurality of nodes also share the same tertiary cache and database, and the method comprises the following steps: under the condition that the first node acquires the data to be cached based on the data acquisition request, determining a caching result based on the data to be cached by using the first node; under the condition that the caching result is that the data to be cached is cached to the corresponding secondary cache, the first node is utilized to cache the identification information of the data to be cached in the tertiary cache; subscribing the identification information by utilizing nodes which receive the data acquisition request from the plurality of nodes; and under the condition that the second node is utilized to clear the data to be cached, utilizing the nodes subscribed with the identification information in the plurality of nodes to respectively clear the data to be cached in the corresponding second-level caches. According to the data caching method, in the distributed service cluster, the nodes of the second-level cache comprising the same cached data subscribe the same identification information in the third-level cache, so that when any node clears the cached data, other nodes subscribing the identification information can clear the same data corresponding to the second-level cache in time, and the consistency of the data cache is improved.
The invention provides a data caching device which is applied to a distributed service cluster, wherein the distributed service cluster comprises a plurality of nodes, each node in the plurality of nodes respectively comprises a corresponding primary cache and a corresponding secondary cache, the plurality of nodes also share the same tertiary cache and a database, and fig. 9 is a schematic structural diagram of the data caching device provided by the embodiment of the invention. As shown in fig. 9, includes:
a determining module 901, configured to determine, with a first node, a cache result based on data to be cached when the first node obtains the data to be cached based on a data obtaining request; the first node is any node in the plurality of nodes, and the data to be cached is obtained from the corresponding primary cache, secondary cache, tertiary cache or database;
a caching module 902, configured to cache, when the caching result is that the data to be cached is cached to a corresponding second level cache, identification information of the data to be cached in the third level cache by using the first node;
a subscription module 903, configured to subscribe the identification information by using a node that receives the data acquisition request from the plurality of nodes;
A clearing module 904, configured to, when clearing the data to be cached by using the second node, clear the data to be cached in the corresponding second level cache by using a node subscribed to the identification information in the plurality of nodes; the second node is any node subscribed to the identification information in the plurality of nodes.
In an embodiment of the present invention, the data caching apparatus further includes a searching module (not shown in the figure), configured to, when the data acquisition request is acquired by using the first node, search, by using the first node, cache data matching the data acquisition request from a corresponding level one cache; under the condition that the first node does not find the cache data matched with the data acquisition request from the corresponding first-level cache, the first node is utilized to find the cache data matched with the data acquisition request from the corresponding second-level cache; under the condition that the first node does not find the cache data matched with the data acquisition request from the corresponding second-level cache, the first node is utilized to find the cache data matched with the data acquisition request from the third-level cache; and under the condition that the first node does not find the cache data matched with the data acquisition request from the three-level cache, the first node is utilized to find the data matched with the data acquisition request from the database, and the found data is determined to be the data to be cached.
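The searching module's cascade above (first-level, then second-level, then third-level cache, then the database) can be sketched compactly. This is an illustrative simplification with hypothetical names; the real module also routes the database result into the caching decision described earlier.

```go
package main

import "fmt"

// tier is one cache level; a miss falls through to the next level.
type tier map[string]string

func (t tier) get(k string) (string, bool) { v, ok := t[k]; return v, ok }

// lookup tries each cache tier in order and only queries the database on a
// full miss; the database result becomes the "data to be cached".
func lookup(key string, l1, l2, l3 tier, db func(string) string) string {
	for _, t := range []tier{l1, l2, l3} {
		if v, ok := t.get(key); ok {
			return v // cache hit at this level
		}
	}
	return db(key) // real query operation
}

func main() {
	l1, l2 := tier{}, tier{}
	l3 := tier{"user:1": "cached"}
	db := func(k string) string { return "fresh:" + k }
	fmt.Println(lookup("user:1", l1, l2, l3, db)) // hit in the third-level cache
	fmt.Println(lookup("user:2", l1, l2, l3, db)) // full miss -> database
}
```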
In an embodiment of the present invention, the lookup module (not shown in the figure) is further configured to generate a target data acquisition task based on the data acquisition request, and split the target data acquisition task into a plurality of data acquisition subtasks; determining target nodes for executing each subtask in the plurality of data acquisition subtasks from the plurality of nodes to obtain a plurality of target nodes corresponding to the plurality of data acquisition subtasks one by one; respectively executing the plurality of data acquisition subtasks by using the plurality of target nodes to obtain a plurality of task results corresponding to the plurality of data acquisition subtasks one by one; and carrying out data aggregation on the task results, and determining the aggregated data as the data to be cached.
In an embodiment of the present invention, the determining module 901 is further configured to obtain data characteristic information of the data to be cached; the data characteristic information at least comprises one or more of the data size of the data to be cached and the number of data requests; under the condition that the importance degree of the data represented by the data characteristic information is larger than or equal to a first preset degree threshold value, determining the caching result as a second-level cache corresponding to the data to be cached to the first node; and under the condition that the importance degree of the data represented by the data characteristic information is smaller than the first preset degree threshold value, determining the caching result to be caching the data to be cached to the three-level cache.
In an embodiment of the present invention, the data caching apparatus further includes a persistence module (not shown in the figure), configured to write the data to be cached into a local file system of the first node, where the importance degree of the data represented by the data characteristic information is greater than a second preset degree threshold; under the condition that the first node is restarted, reading the data to be cached from the local file system; and under the condition that the first node clears the data to be cached in the corresponding secondary cache, clearing the data to be cached from the local file system.
In an embodiment of the present invention, the data caching apparatus further includes a setting module (not shown in the figure) configured to set, when the data to be cached is cached based on the caching result, a data expiration time of the data to be cached based on the data characteristic information of the data to be cached.
In an embodiment of the present invention, the data caching apparatus further includes a notification module (not shown in the figure), configured to notify, when an abnormality occurs in a communication service subscribed to the identification information in the third-level cache and the second node clears the data to be cached, a node different from the second node among the plurality of nodes by using a message bus of the plurality of nodes; and respectively clearing the data to be cached in the corresponding secondary cache by utilizing nodes which are different from the second node in the plurality of nodes.
The invention provides a data caching device, and fig. 10 is a schematic diagram of a structure of the data caching device according to an embodiment of the invention. As shown in fig. 10, the data caching apparatus includes: a processor 1001, a memory 1002, and a communication bus 1003;
the communication bus 1003 is used for realizing communication connection between the processor 1001 and the memory 1002;
the processor 1001 is configured to execute a computer program stored in the memory 1002 to implement the data caching method described above.
The invention provides a data caching device, which sequentially searches data from corresponding first-level cache, second-level cache and third-level cache and a database by utilizing a first node in a plurality of nodes based on a data acquisition request until a data to be cached is acquired in the database, and determines a caching result based on the data to be cached by utilizing the first node; under the condition that the caching result determined by the first node is that the data to be cached is cached to the corresponding second-level cache, the first node is utilized to cache the identification information of the data to be cached in the third-level cache; subscribing the identification information by utilizing nodes which receive the data acquisition request from the plurality of nodes; and under the condition that the second node subscribed with the identification information in the plurality of nodes is utilized to clear the data to be cached, the nodes subscribed with the identification information in the plurality of nodes are utilized to respectively clear the data to be cached in the corresponding secondary cache. In the data caching device provided by the invention, the distributed service cluster is arranged, and the nodes of the second-level cache comprising the same cached data subscribe the same identification information in the third-level cache, so that when any node clears the cached data, other nodes subscribing the identification information can clear the same data corresponding to the second-level cache in time, thereby improving the consistency of the data cache.
The present invention provides a computer readable storage medium storing one or more computer programs executable by one or more processors to implement the above described data caching method. The computer readable storage medium may be a volatile memory, such as a Random-Access Memory (RAM); or a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD) or a Solid State Drive (SSD); or it may be a device comprising one or any combination of the above memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. The data caching method is characterized by being applied to a distributed service cluster, wherein the distributed service cluster comprises a plurality of nodes, each node in the plurality of nodes respectively comprises a corresponding primary cache and a corresponding secondary cache, and the plurality of nodes also share the same tertiary cache and database, and the method comprises the following steps:
Under the condition that data to be cached is acquired by a first node based on a data acquisition request, determining a caching result based on the data to be cached by the first node; the first node is any node in the plurality of nodes, and the data to be cached is obtained from the corresponding primary cache, secondary cache, tertiary cache or database;
under the condition that the caching result is that the data to be cached is cached to the corresponding second-level cache, the first node is utilized to cache the identification information of the data to be cached in the third-level cache;
subscribing the identification information by utilizing the nodes which receive the data acquisition request from the plurality of nodes;
under the condition that the second node is utilized to clear the data to be cached, nodes subscribing the identification information in the plurality of nodes are utilized to respectively clear the data to be cached in the corresponding second-level cache; the second node is any node subscribed to the identification information in the plurality of nodes.
2. The method according to claim 1, wherein acquiring, by the first node, the data to be cached based on the data acquisition request comprises:
in a case where the first node acquires the data acquisition request, searching, by the first node, the corresponding first-level cache for cached data matching the data acquisition request;
in a case where the first node does not find cached data matching the data acquisition request in the corresponding first-level cache, searching, by the first node, the corresponding second-level cache for cached data matching the data acquisition request;
in a case where the first node does not find cached data matching the data acquisition request in the corresponding second-level cache, searching, by the first node, the third-level cache for cached data matching the data acquisition request; and
in a case where the first node does not find cached data matching the data acquisition request in the third-level cache, searching, by the first node, the database for data matching the data acquisition request, and determining the found data as the data to be cached.
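The lookup cascade of claim 2 condenses to a small helper; the function name and the dict-backed tiers are assumptions for illustration.

```python
def find_to_cache(node_l1, node_l2, shared_l3, db, key):
    # Search the first-level, second-level, and shared third-level cache in order.
    for tier in (node_l1, node_l2, shared_l3):
        if key in tier:
            return tier[key], False  # cache hit: nothing new to cache
    # Miss in every tier: fetch from the database; this is the data to be cached.
    return db[key], True

l1, l2, l3 = {}, {"k2": 2}, {"k3": 3}
db = {"k4": 4}
```

The boolean flag distinguishes a hit (no caching decision needed) from a database fetch that must then go through the claim 4 placement logic.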
3. The method according to claim 2, wherein searching the database for data matching the data acquisition request and determining the found data as the data to be cached comprises:
generating a target data acquisition task based on the data acquisition request, and splitting the target data acquisition task into a plurality of data acquisition subtasks;
determining, from the plurality of nodes, a target node for executing each of the plurality of data acquisition subtasks, to obtain a plurality of target nodes in one-to-one correspondence with the plurality of data acquisition subtasks;
executing the plurality of data acquisition subtasks by the plurality of target nodes, respectively, to obtain a plurality of task results in one-to-one correspondence with the plurality of data acquisition subtasks; and
performing data aggregation on the plurality of task results, and determining the aggregated data as the data to be cached.
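A minimal sketch of the split/execute/aggregate flow of claim 3, with round-robin key assignment as an assumed splitting rule (the claim does not fix how the task is partitioned):

```python
def split_task(keys, n_nodes):
    # Split the target data acquisition task into one subtask per target node
    # (round-robin over the keys; the splitting rule is an assumption).
    return [keys[i::n_nodes] for i in range(n_nodes)]

def execute_subtask(db, keys):
    # Each target node resolves its own slice of keys against the database.
    return {k: db[k] for k in keys}

def acquire_via_subtasks(db, keys, n_nodes):
    results = [execute_subtask(db, sub) for sub in split_task(keys, n_nodes)]
    aggregated = {}
    for result in results:  # aggregate the per-node task results
        aggregated.update(result)
    return aggregated  # the aggregated data becomes the data to be cached
```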
4. The method according to claim 1, wherein determining the caching result based on the data to be cached comprises:
acquiring data characteristic information of the data to be cached; wherein the data characteristic information comprises at least one of a data size of the data to be cached and a number of data requests for the data to be cached;
in a case where a data importance degree represented by the data characteristic information is greater than or equal to a first preset degree threshold, determining the caching result to be caching the data to be cached in the second-level cache corresponding to the first node; and
in a case where the data importance degree represented by the data characteristic information is less than the first preset degree threshold, determining the caching result to be caching the data to be cached in the third-level cache.
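The placement decision of claim 4 might look like the following; the claim does not specify how importance is computed from the characteristic information, so the scoring rule and threshold below are assumptions.

```python
def choose_tier(size_bytes, request_count, first_threshold=10.0):
    # Assumed scoring rule: small, frequently requested data is more important.
    importance = request_count / max(size_bytes, 1)
    # At or above the first preset degree threshold: node-local second-level
    # cache; otherwise the shared third-level cache.
    return "second-level" if importance >= first_threshold else "third-level"
```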
5. The method according to claim 4, wherein after determining the caching result to be caching the data to be cached in the second-level cache corresponding to the first node, the method further comprises:
writing the data to be cached into a local file system of the first node in a case where the data importance degree represented by the data characteristic information is greater than a second preset degree threshold;
reading the data to be cached from the local file system in a case where the first node is restarted; and
clearing the data to be cached from the local file system in a case where the first node clears the data to be cached from the corresponding second-level cache.
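The write-through/reload/clear lifecycle of claim 5 can be sketched as a second-level cache with a local-file backing store; the JSON file format and class name are assumptions.

```python
import json
import os

class PersistentL2:
    def __init__(self, path):
        self.path = path  # file in the node's local file system
        self.l2 = {}      # the node's second-level cache

    def put(self, key, value, important):
        self.l2[key] = value
        if important:  # importance above the second preset degree threshold
            with open(self.path, "w") as f:
                json.dump(self.l2, f)

    def restart(self):
        # On node restart the in-memory cache is gone; reload persisted entries.
        self.l2 = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.l2 = json.load(f)

    def clear(self, key):
        # Clearing the second-level entry also clears it from the local file.
        self.l2.pop(key, None)
        with open(self.path, "w") as f:
            json.dump(self.l2, f)
```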
6. The method according to claim 1, wherein after determining the caching result based on the data to be cached, the method further comprises:
setting a data expiration time for the data to be cached based on the data characteristic information of the data to be cached, in a case where the data to be cached is cached based on the caching result.
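Claim 6 leaves the mapping from characteristic information to expiration time open; one assumed rule, scaling a base TTL with request count, is sketched below.

```python
def expiration_seconds(size_bytes, request_count, base_ttl=300, max_ttl=3600):
    # Assumed rule: hotter (more frequently requested) data stays cached
    # longer, capped at max_ttl. size_bytes is part of the characteristic
    # information but is left unused in this minimal sketch.
    return min(base_ttl * max(request_count, 1), max_ttl)
```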
7. The method according to claim 1, wherein after subscribing to the identification information by the nodes, among the plurality of nodes, that received the data acquisition request, the method further comprises:
in a case where a communication service for subscribing to the identification information in the third-level cache is abnormal and the second node clears the data to be cached, notifying, through a message bus of the plurality of nodes, the nodes among the plurality of nodes other than the second node; and
clearing the data to be cached from the respective corresponding second-level caches by the nodes among the plurality of nodes other than the second node.
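The fallback path of claim 7 can be sketched with a toy in-process message bus; the `MessageBus` class is an assumption standing in for whatever bus the cluster actually uses (Kafka, RabbitMQ, or similar).

```python
class MessageBus:
    # Stand-in for the nodes' message bus: a plain callback registry here.
    def __init__(self):
        self.handlers = []

    def register(self, node_name, handler):
        self.handlers.append((node_name, handler))

    def broadcast(self, sender, key):
        # Notify every node other than the clearing (second) node.
        for node_name, handler in self.handlers:
            if node_name != sender:
                handler(key)

def clear_with_fallback(bus, subscription_ok, notify_subscribers, sender, key):
    if subscription_ok:
        notify_subscribers(key)     # normal path: third-level cache subscription
    else:
        bus.broadcast(sender, key)  # fallback path: message-bus notification
```

The point of the fallback is that eviction notices still reach the other nodes even while the third-level cache's subscription service is down.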
8. A data caching apparatus, applied to a distributed service cluster, the distributed service cluster comprising a plurality of nodes, each of the plurality of nodes comprising a corresponding first-level cache and a corresponding second-level cache, and the plurality of nodes sharing a same third-level cache and a same database, the apparatus comprising:
a determining module, configured to determine, by a first node, a caching result based on data to be cached in a case where the first node acquires the data to be cached based on a data acquisition request; wherein the first node is any one of the plurality of nodes, and the data to be cached is acquired from the corresponding first-level cache, the corresponding second-level cache, the third-level cache, or the database;
a caching module, configured to cache, by the first node, identification information of the data to be cached in the third-level cache in a case where the caching result is to cache the data to be cached in the corresponding second-level cache;
a subscription module, configured to subscribe to the identification information by the nodes, among the plurality of nodes, that receive the data acquisition request; and
a clearing module, configured to clear the data to be cached from the respective corresponding second-level caches by the nodes that have subscribed to the identification information, in a case where a second node clears the data to be cached; wherein the second node is any node, among the plurality of nodes, that has subscribed to the identification information.
9. A data caching apparatus, comprising: a processor, a memory, and a communication bus; wherein
the communication bus is configured to implement a communication connection between the processor and the memory; and
the processor is configured to execute a computer program stored in the memory, to implement the data caching method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the data caching method according to any one of claims 1 to 7.
CN202211429929.7A 2022-11-15 2022-11-15 Data caching method, device and storage medium Pending CN116775700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211429929.7A CN116775700A (en) 2022-11-15 2022-11-15 Data caching method, device and storage medium


Publications (1)

Publication Number Publication Date
CN116775700A true CN116775700A (en) 2023-09-19

Family

ID=88008752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211429929.7A Pending CN116775700A (en) 2022-11-15 2022-11-15 Data caching method, device and storage medium

Country Status (1)

Country Link
CN (1) CN116775700A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination