WO2022016861A1 - Hotspot data caching method and system, and related device - Google Patents

Hotspot data caching method and system, and related device

Info

Publication number
WO2022016861A1
Authority
WO
WIPO (PCT)
Prior art keywords
hotspot
data
queue
hotspot data
caching
Prior art date
Application number
PCT/CN2021/076978
Other languages
French (fr)
Chinese (zh)
Inventor
谢有权
Original Assignee
浪潮电子信息产业股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浪潮电子信息产业股份有限公司
Publication of WO2022016861A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0625 Power saving in storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements

Definitions

  • the present application relates to the field of data processing, and in particular, to a method, system and related device for caching hotspot data.
  • when multiple distributed clients share one volume, client-side caching is generally not used. Because the clients do not communicate with one another, a client that caches data cannot detect when another client modifies that data, which leads to data inconsistency. The usual remedy is to establish communication between the clients. If the clients hold a large amount of data, however, this communication congests the network, and enforcing data consistency blocks normal business traffic, degrading client service performance. In addition, multiple distributed clients may cache duplicate data, which wastes resources; since client resources are limited, this can easily affect the normal operation of the client.
  • the purpose of this application is to provide a hotspot data caching method, system, computer-readable storage medium and server, which improve the caching performance in a shared volume scenario.
  • the present application provides a method for caching hotspot data, and the specific technical solutions are as follows:
  • the hotspot data in the hotspot queue is cached to the local storage of the cluster.
  • judging whether the request data is hotspot data according to the current number of visits includes:
  • Whether the requested data is hot data is determined by using the least recently used policy according to the current number of visits.
  • the method further includes:
  • the tail of the hotspot queue is the hotspot data with the lowest hotspot degree.
  • adding the hotspot data to the hotspot queue includes:
  • the hotspot flag of each hotspot data in the aging queue is changed to a hotspot aging flag.
  • after changing the hotspot mark of each hotspot data in the aging queue to the hotspot aging mark, the method further includes:
  • the preset formula is: [formula image PCTCN2021076978-appb-000001, not reproduced in this text]
  • where $D = t - t_1$, $\beta$ is the aging parameter, $D$ is the access time interval, $t$ is the current time, $t_1$ is the hotspot marking time of the hotspot data, $\delta$ is the preset attenuation factor, and $T$ is the unit period.
  • the method further includes:
  • the first part of the data and the second part of the data are combined and returned to the requester of the read operation.
  • caching the hotspot data in the hotspot queue to the cluster local storage includes:
  • the hotspot data and the corresponding block object ID are cached to the local storage of the cluster.
  • the application also provides a hotspot data caching system, including:
  • a receiving module configured to receive read and write requests, and confirm that the read and write requests correspond to request data
  • a judgment module used for adding one to the access times of the requested data, and judging whether the requested data is hot data according to the current access times
  • a hotspot marking module for adding the hotspot data to the hotspot queue when the judgment result of the judgment module is yes;
  • the cache module is configured to add a hotspot mark to the hotspot data in the hotspot queue, and cache the hotspot data in the hotspot queue to the local storage of the cluster.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps of the above-described method.
  • the present application also provides a server, including a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the above method when the computer program in the memory is invoked.
  • the present application provides a method for caching hotspot data, which includes: receiving a read/write request, and identifying the request data corresponding to the read/write request; incrementing the access count of the request data by one, and determining, according to the current access count, whether the request data is hotspot data; if so, adding the hotspot data to a hotspot queue, and adding a hotspot mark to the hotspot data in the hotspot queue; and caching the hotspot data in the hotspot queue to the cluster local storage.
  • the present application uses the client to identify hotspot data and saves the hotspot data, through the hotspot queue, to the cluster local storage at the bottom layer of the distributed storage cluster.
  • the local storage engine at the bottom layer of the cluster ensures the consistency of the data cached for each client, and the distributed storage cluster bears the resources occupied by hotspot data. This reduces the resource consumption that caching hotspot data would impose on the client, improves client service performance, ensures that all clients share a consistent cache, and further improves cache performance in a distributed shared storage system.
  • the present application also provides a method, system, computer-readable storage medium and server for caching hotspot data, which have the above beneficial effects, and will not be repeated here.
  • FIG. 1 is a flowchart of a method for caching hotspot data provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a hotspot data caching system provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of a method for caching hotspot data provided by an embodiment of the present application.
  • the specific technical solution is as follows:
  • S101 Receive a read/write request, and confirm that the read/write request corresponds to request data
  • the client receives the read and write requests, and confirms the corresponding request data.
  • the corresponding object can be found according to the offset and length of the read operation.
  • S102 Add one to the access times of the requested data, and determine whether the requested data is hotspot data according to the current access times; if so, go to S103;
  • the access count of the requested data is incremented at this point, and it is determined whether the requested data satisfies the hotspot data condition, that is, whether it has become hotspot data.
  • the least recently used (LRU) policy is essentially a cache eviction policy; here it is used to determine, mainly from how frequently the data is accessed, whether the data is hotspot data.
  • S103 adding the hotspot data to a hotspot queue, and adding a hotspot mark to the hotspot data in the hotspot queue;
  • the hotspot data is added to the hotspot queue, and a hotspot mark is added to the hotspot data to identify that the data has become hotspot data.
  • the hotspot queue can be a storage queue in the storage device, or it can serve only as a logical collection of hotspot data without an actual storage structure.
  • for example, the hotspot queue can be attached to the hotspot data as an attribute, indicating that the data has been added to the hotspot queue.
  • after the hotspot data is added to the hotspot queue, the queue can also be sorted according to the hotspot degree of each item of hotspot data, so that the tail of the hotspot queue holds the hotspot data with the lowest hotspot degree.
  • the hotness degree of hotspot data can generally be determined according to the access frequency of each hotspot data in a unit time, and the unit time may be a preset period, such as one day, one week, or one month.
  • the hotspot degree can be further determined according to the last access time, that is, the closer the last access time is to the current time, the higher the hotspot degree.
  • S104 Cache the hotspot data in the hotspot queue to local storage in the cluster.
  • This step is designed to cache hotspot data to cluster local storage. It should be noted that in this step, the client sends the hotspot queue to the master node of the distributed storage cluster by default, and then the master node caches the hotspot data in the hotspot queue to the local storage of the cluster.
  • the cluster local storage is located at the bottom layer of the distributed storage cluster and can be accessed by all clients in the cluster. At this time, cache consistency between clients can be achieved. Specifically, it is also necessary to distinguish hotspot data and non-hotspot data according to the hotspot mark in step S103.
  • each client can determine the corresponding hotspot data according to the read and write status of its own data, and feed it back to the hotspot queue in the cluster.
  • how the hotspot data is cached is not specifically limited here. One option is to first identify the ID of the block object that contains the hotspot data bearing the hotspot mark, and then cache the hotspot data and the corresponding block object ID to the cluster local storage.
  • the shared volume in the distributed storage cluster is divided, that is, divided into several block objects according to the preset size, and any data has its own block object. By dividing the shared volume, it is convenient to improve the search efficiency of hot data in the cache, which is equivalent to establishing an index for each data.
  • the ID of the block object that contains the hotspot data may be determined first, and the corresponding block object ID is stored in the cluster local storage together with the hotspot data when it is cached. It should be noted that block objects are not cached in their entirety. Since a block object may contain non-hotspot data, only the hotspot data in the block object is saved when caching is performed.
  • this embodiment can be executed each time the client receives a read/write request. If the requested data was already hotspot data before its access count was incremented, the increment means that the hotspot degree of the requested data has risen, and its position in the hotspot queue can move closer to the head of the queue. When the requested data is already hotspot data, the corresponding hotspot data can be retrieved directly from the cluster local storage to answer the request.
  • the client identifies hotspot data and saves it, through the hotspot queue, to the cluster local storage at the bottom layer of the distributed storage cluster, and the local storage engine at the bottom layer of the cluster ensures the consistency of the data cached for each client.
  • the distributed storage cluster bears the resources occupied by hotspot data, which reduces the resource consumption that caching hotspot data would impose on the client, improves client service performance, ensures that all clients share a consistent cache, and further improves cache performance in a distributed shared storage system.
  • the following steps may also be performed:
  • the hotspot data at the end of the hotspot queue is moved to the aging queue, and the hotspot mark of each hotspot data in the aging queue is changed to a hotspot aging mark.
  • the hotspot data at the end of the hotspot queue is removed and added to the aging queue.
  • the default hotspot queue has been sorted according to the hotspot degree of each hotspot data, that is, the hotspot data at the end of the queue is the hotspot data with the lowest current hotspot degree.
  • the access frequency of the new hotspot data and that of the hotspot data at the tail of the hotspot queue are usually the same or similar. However, since the new hotspot data was clearly accessed more recently than the hotspot data already in the hotspot queue, the hotspot data at the tail of the queue can be removed unconditionally.
  • the aging queue is designed to hold the replaced hotspot data; being in the aging queue does not mean that the data is no longer hotspot data. Therefore, not all data in the aging queue needs to be removed from the cache. Instead, it is further determined whether data bearing the hotspot aging mark really needs to be removed from the cache.
  • S203 Determine whether the difference between the aging parameter of the hotspot data and the number of visits per unit period is less than the rejection threshold; if not, go to S204;
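  • the preset formula is: [formula image PCTCN2021076978-appb-000001, not reproduced in this text]
  • where $D = t - t_1$, $\beta$ is the aging parameter, $D$ is the access time interval, $t$ is the current time, $t_1$ is the time at which the hotspot mark was applied, $\delta$ is the preset decay factor, and $T$ is the unit period.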
  • the rejection threshold and preset attenuation factor are not specifically limited here, and can be set by those skilled in the art according to actual rejection requirements.
  • the execution processes of step S201 and step S202 are independent of each other and have no predetermined order; both only need to be completed before the judgment of S203 is executed. It is easy to understand that the above process is only one detailed process, provided in this embodiment, for removing hotspot data from the aging queue; those skilled in the art can make improvements to it, and such improvements should fall within the protection scope of the present application.
  • the execution subject of this embodiment does not have to be the client. Since the hotspot queue is held by the server in the distributed storage cluster, the hotspot data aging process in this embodiment can also be executed by the server. This reduces the hotspot data processing load on the client, avoids repeated operations across different clients, and reduces wasted resources. In addition, the aging queue in this embodiment helps ensure that the hotspot data in the hotspot queue is the currently hottest data in the distributed storage cluster, which improves the utilization efficiency of the cluster cache.
  • S301 Determine whether the hotspot data corresponding to the read operation are all located in the local storage of the cluster; if not, go to S302;
  • S302 Obtain the first part of the requested data from the cluster local storage, and obtain the second part of the requested data from the disk;
  • S303 Combine the first part of the data and the second part of the data and return to the requester of the read operation.
  • the data actually requested by a read operation for hotspot data sent by the client is not necessarily all present among the hotspot data actually cached in the cluster local storage of the current distributed storage system.
  • part of the data requested by the read operation may have become aging data bearing the hotspot aging mark; or, because the division granularity was too coarse when the shared volume was divided, each block object is large, so that the hotspot data of some block objects was not completely cached when hotspot data caching was executed. In such cases the data can be obtained according to where the requested data actually resides: the first part, cached in the cluster local storage, is read directly from the cache, while the other part requires a disk I/O operation to obtain the second part. The data finally returned to the requester should contain both the first part and the second part.
  • the hotspot data caching system described below and the hotspot data caching method described above may refer to each other correspondingly.
  • FIG. 2 is a schematic structural diagram of a hotspot data caching system provided by an embodiment of the present application, and the present application also provides a hotspot data caching system, including:
  • a receiving module 100 configured to receive a read/write request, and confirm that the read/write request corresponds to request data
  • Judging module 200 for adding one to the number of visits of the requested data, and judging whether the requested data is hot data according to the current number of visits;
  • a hotspot marking module 300 configured to add the hotspot data to the hotspot queue when the judgment result of the judgment module is yes;
  • the caching module 400 is configured to add a hotspot mark to the hotspot data in the hotspot queue, and cache the hotspot data in the hotspot queue to local storage in the cluster.
  • the judgment module 200 includes:
  • a hotspot sorting module configured to sort according to the hotspot degree of each of the hotspot data in the hotspot queue
  • the tail of the hotspot queue is the hotspot data with the lowest hotspot degree.
  • An aging processing module configured to move the hotspot data at the end of the hotspot queue to the aging queue when the hotspot queue is full, and change the hotspot mark of each hotspot data in the aging queue to the hotspot aging mark.
  • the aging processing module may include:
  • an aging processing unit configured to obtain the number of accesses to the hotspot data within a unit period; calculate the aging parameter of the hotspot data according to a preset formula; determine whether the difference between the aging parameter of the hotspot data and the number of accesses within the unit period is less than the rejection threshold; and, if not, remove the hotspot data from the cluster local storage;
  • the preset formula is: [formula image PCTCN2021076978-appb-000001, not reproduced in this text]
  • where $D = t - t_1$, $\beta$ is the aging parameter, $D$ is the access time interval, $t$ is the current time, $t_1$ is the hotspot marking time of the hotspot data, $\delta$ is the preset attenuation factor, and $T$ is the unit period.
  • a data retrieval module configured to determine whether all of the hotspot data corresponding to the read operation is located in the cluster local storage; if not, obtain the first part of the requested data from the cluster local storage and obtain the second part of the requested data from disk; and merge the first part and the second part and return the result to the requester of the read operation.
  • the cache module 400 includes:
  • a confirmation unit for confirming the block object ID where the hotspot data including the hotspot mark is located
  • a cache unit configured to cache the hotspot data and corresponding block object IDs to local storage in the cluster.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed, the steps provided by the above embodiments can be implemented.
  • the storage medium may include: U disk, removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.
  • the present application also provides a server, which may include a memory and a processor, where a computer program is stored in the memory, and when the processor invokes the computer program in the memory, the steps provided in the above embodiments can be implemented.
  • a server may also include various network interfaces, power supplies and other components.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A hotspot data caching method and system, a computer-readable storage medium, and a server. The method comprises: receiving a read/write request, and identifying the request data corresponding to the read/write request (S101); incrementing the access count of the request data by one, and determining whether the request data is hotspot data according to the current access count (S102); if so, adding the hotspot data to a hotspot queue, and adding a hotspot mark to the hotspot data in the hotspot queue (S103); and caching the hotspot data in the hotspot queue to cluster local storage (S104). The resource consumption caused by a client caching hotspot data is reduced, the service performance of the client is improved, all clients are ensured to share a consistent cache, and cache performance in a distributed shared storage system is further improved.

Description

A hotspot data caching method, system, and related device
This application claims priority to Chinese patent application No. 202010724366.9, titled "A hotspot data caching method, system, and related device", filed with the China Patent Office on July 24, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of data processing, and in particular to a hotspot data caching method, system, and related device.
Background
With the rapid development of distributed shared storage systems, their performance and security are receiving increasing attention. When multiple distributed clients share one volume, client-side caching is generally not used. Because the clients do not communicate with one another, a client that caches data cannot detect when another client modifies that data, which leads to data inconsistency. The usual remedy is to establish communication between the clients. If the clients hold a large amount of data, however, this communication congests the network, and enforcing data consistency blocks normal business traffic, degrading client service performance. In addition, multiple distributed clients may cache duplicate data, which wastes resources; since client resources are limited, this can easily affect the normal operation of the client.
Therefore, how to achieve effective caching in a distributed shared storage system is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The purpose of the present application is to provide a hotspot data caching method, system, computer-readable storage medium, and server that improve caching performance in a shared-volume scenario.
To solve the above technical problem, the present application provides a hotspot data caching method. The specific technical solution is as follows:
receiving a read/write request, and identifying the request data corresponding to the read/write request;
incrementing the access count of the request data by one, and determining, according to the current access count, whether the request data is hotspot data;
if so, adding the hotspot data to a hotspot queue, and adding a hotspot mark to the hotspot data in the hotspot queue;
caching the hotspot data in the hotspot queue to the cluster local storage.
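The four steps above can be sketched roughly as follows. This is a minimal illustration rather than the application's implementation: the class name `HotspotTracker`, the fixed `hot_threshold`, and the in-memory dictionary standing in for the cluster local storage are all assumptions made for the example. The application itself does not specify a threshold and leaves the hotspot criterion to an LRU-style policy.

```python
from collections import defaultdict


class HotspotTracker:
    """Counts accesses and promotes data to a hotspot queue (illustrative only)."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold    # assumed criterion, not from the application
        self.access_count = defaultdict(int)  # request data key -> access count
        self.hotspot_queue = []               # keys currently treated as hotspot data
        self.marks = {}                       # key -> hotspot mark
        self.cluster_local_storage = {}       # stand-in for the cluster-side cache

    def on_request(self, key, payload):
        # Steps 1 and 2: identify the request data and increment its access count.
        self.access_count[key] += 1
        is_hot = self.access_count[key] >= self.hot_threshold
        if is_hot and key not in self.marks:
            # Step 3: add to the hotspot queue and attach a hotspot mark.
            self.hotspot_queue.append(key)
            self.marks[key] = "hot"
            # Step 4: cache the hotspot data to the cluster local storage.
            self.cluster_local_storage[key] = payload
        return is_hot
```

In this sketch a key is enqueued only once; later accesses simply keep raising its count, which a separate sorting step could use to reposition it in the queue.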
Optionally, determining whether the request data is hotspot data according to the current access count includes:
determining whether the request data is hotspot data by applying a least recently used (LRU) policy to the current access count.
Optionally, after adding the hotspot data to the hotspot queue, the method further includes:
sorting the hotspot queue according to the hotspot degree of each item of hotspot data in the queue;
wherein the tail of the hotspot queue holds the hotspot data with the lowest hotspot degree.
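One possible way to order the queue is sketched here. Elsewhere the application describes the hotspot degree as generally determined by the access frequency within a unit period, optionally refined by how recent the last access was. The `HotEntry` record and its field names are hypothetical, and using the last access time purely as a tie-breaker is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class HotEntry:
    key: str
    accesses_in_period: int  # access count within the unit period
    last_access: float       # timestamp of the most recent access


def sort_hotspot_queue(queue):
    """Return the queue ordered hottest first, so the tail is the coldest entry."""
    return sorted(
        queue,
        key=lambda e: (e.accesses_in_period, e.last_access),
        reverse=True,
    )
```

With this ordering, the entry to demote when the queue is full is always the last element.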
Optionally, when the hotspot queue is full, adding the hotspot data to the hotspot queue includes:
moving the hotspot data at the tail of the hotspot queue to an aging queue, and then adding the hotspot data to the hotspot queue;
changing the hotspot mark of each item of hotspot data in the aging queue to a hotspot aging mark.
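The full-queue handling can be sketched as below, assuming the hotspot queue is kept hottest first so that its right end is the tail. The function name `add_hotspot`, the mark values, and the `capacity` parameter are illustrative choices, not terms from the application.

```python
from collections import deque

HOT, AGING = "hot", "hot_aging"  # mark values are illustrative


def add_hotspot(hot_queue, aging_queue, marks, key, capacity):
    """Add `key` to the hotspot queue, demoting the tail entry if the queue is full."""
    if len(hot_queue) >= capacity:
        demoted = hot_queue.pop()   # tail = lowest hotspot degree
        aging_queue.append(demoted)
        marks[demoted] = AGING      # hotspot mark becomes hotspot aging mark
    # The new entry is appended here; a later sort settles its final position.
    hot_queue.append(key)
    marks[key] = HOT
```

Demoted entries stay cached at this point; whether they are actually removed is decided separately for entries carrying the aging mark.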
After changing the hotspot mark of each item of hotspot data in the aging queue to the hotspot aging mark, the method further includes:
obtaining the number of accesses to the hotspot data within a unit period;
calculating an aging parameter of the hotspot data according to a preset formula;
determining whether the difference between the aging parameter of the hotspot data and the number of accesses within the unit period is less than a rejection threshold;
if not, removing the hotspot data from the cluster local storage;
wherein the preset formula is:
[Formula image PCTCN2021076978-appb-000001, not reproduced in this text]
where $D = t - t_1$, $\beta$ is the aging parameter, $D$ is the access time interval, $t$ is the current time, $t_1$ is the hotspot marking time of the hotspot data, $\delta$ is the preset decay factor, and $T$ is the unit period.
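The removal decision can be sketched as follows. Because the preset formula survives only as an image reference in this text, the sketch does not attempt to reproduce it: the caller supplies it as `aging_formula`, a hypothetical callable that maps the interval $D$ to the aging parameter. Reading "the difference" as the aging parameter minus the access count is also an assumption.

```python
def should_evict(beta, accesses_in_period, rejection_threshold):
    """Evict when (beta - accesses) is NOT below the threshold, per the steps above."""
    return not (beta - accesses_in_period < rejection_threshold)


def process_aging_queue(aging_queue, cache, now, aging_formula, rejection_threshold):
    """aging_queue: iterable of (key, mark_time, accesses_in_period) tuples."""
    evicted = []
    for key, mark_time, accesses in aging_queue:
        interval = now - mark_time      # D = t - t1
        beta = aging_formula(interval)  # preset formula, supplied by the caller
        if should_evict(beta, accesses, rejection_threshold):
            cache.pop(key, None)        # remove from the cluster local storage stand-in
            evicted.append(key)
    return evicted
```

Entries that are still accessed often enough relative to their aging parameter remain cached even though they have left the hotspot queue.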
Optionally, when a read request for hotspot data is received, the method further includes:
determining whether all of the hotspot data corresponding to the read operation is located in the cluster local storage;
if not, obtaining a first part of the request data from the cluster local storage, and obtaining a second part of the request data from disk;
merging the first part and the second part, and returning the result to the requester of the read operation.
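A rough sketch of this read path is given below. It assumes the request has already been split into ordered chunks identified by `chunk_ids`, and that `read_from_disk` is a caller-provided function; both names are hypothetical, and the application does not prescribe this chunking.

```python
def read(chunk_ids, cache, read_from_disk):
    """Serve a read from the cluster local storage, falling back to disk for misses."""
    parts = []
    for cid in chunk_ids:
        data = cache.get(cid)           # first part: already cached hotspot data
        if data is None:
            data = read_from_disk(cid)  # second part: ordinary disk I/O
        parts.append(data)
    # Merge both parts in request order before returning to the requester.
    return b"".join(parts)
```

Only the chunks missing from the cache trigger disk I/O, so a fully cached request never touches the disk.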
Optionally, caching the hotspot data in the hotspot queue to the cluster local storage includes:
identifying the ID of the block object that contains the hotspot data bearing the hotspot mark;
caching the hotspot data and the corresponding block object ID to the cluster local storage.
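The block object lookup can be illustrated as follows. The application states that the shared volume is divided into block objects of a preset size and that only the hotspot data inside a block object is cached, not the whole object. The 4 MiB size, the use of an integer index as the block object ID, and the `(block ID, inner offset)` cache key are assumptions of this sketch.

```python
BLOCK_OBJECT_SIZE = 4 * 1024 * 1024  # assumed preset size (4 MiB)


def block_object_ids(offset, length, block_size=BLOCK_OBJECT_SIZE):
    """Return the IDs of the block objects touched by [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def cache_hot_range(cache, offset, data, block_size=BLOCK_OBJECT_SIZE):
    """Cache only the hot bytes, keyed by (block object ID, offset inside the block)."""
    pos = 0
    while pos < len(data):
        block_id, inner = divmod(offset + pos, block_size)
        take = min(block_size - inner, len(data) - pos)  # stay within one block object
        cache[(block_id, inner)] = data[pos:pos + take]
        pos += take
```

Keying by block object ID gives the cache an index-like structure, which matches the application's remark that dividing the volume makes hotspot data easier to locate.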
The present application also provides a hotspot data caching system, including:
a receiving module, configured to receive a read/write request and identify the request data corresponding to the read/write request;
a judgment module, configured to increment the access count of the request data by one and determine, according to the current access count, whether the request data is hotspot data;
a hotspot marking module, configured to add the hotspot data to a hotspot queue when the judgment result of the judgment module is yes;
a cache module, configured to add a hotspot mark to the hotspot data in the hotspot queue and cache the hotspot data in the hotspot queue to the cluster local storage.
The present application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method described above.
The present application also provides a server, including a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the method described above when it invokes the computer program in the memory.
The present application provides a hotspot data caching method, including: receiving a read/write request and determining the requested data corresponding to the read/write request; incrementing the access count of the requested data by one, and determining, according to the current access count, whether the requested data is hotspot data; if so, adding the hotspot data to a hotspot queue and adding a hotspot mark to the hotspot data in the hotspot queue; and caching the hotspot data in the hotspot queue to the cluster local storage.
In the present application, the client identifies hotspot data and saves it, via the hotspot queue, to the cluster local storage at the bottom layer of the distributed storage cluster. The local storage engine at the bottom layer of the cluster guarantees the consistency of the data cached for each client, and the distributed storage cluster bears the resources occupied by the hotspot data. This reduces the resource consumption caused by clients caching hotspot data, improves client service performance, ensures that all clients share a consistent cache, and further improves cache performance in a distributed shared storage system.
The present application also provides a hotspot data caching method, a system, a computer-readable storage medium and a server, which have the above beneficial effects and are not described again here.
Description of drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a flowchart of a hotspot data caching method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a hotspot data caching system provided by an embodiment of the present application.
Detailed description
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
Please refer to FIG. 1, which is a flowchart of a hotspot data caching method provided by an embodiment of the present application. The specific technical solution is as follows:
S101: Receive a read/write request, and determine the requested data corresponding to the read/write request;
In this step, the client receives the read/write request and determines the corresponding requested data. Specifically, the corresponding object can be located from the offset and length of the read operation.
S102: Increment the access count of the requested data by one, and determine, according to the current access count, whether the requested data is hotspot data; if so, go to S103;
Because the read/write request accesses the requested data, the access count of that data increases, and it is then determined whether the requested data satisfies the hotspot data condition, that is, whether it has become hotspot data. How this determination is made is not specifically limited here; for example, a least recently used policy may be applied to the current access count. The least recently used (LRU) policy is essentially a cache eviction policy, which here judges, mainly from the data's access frequency, whether the data is hotspot data. Those skilled in the art may of course determine whether the requested data is hotspot data in other ways, which are not enumerated here.
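Step S102 can be illustrated with a minimal sketch. The source does not fix the hotspot condition, so the threshold `hot_threshold`, the tracking limit `capacity`, and the class name `AccessTracker` are all assumptions made for illustration. Counts are kept in LRU order so that keys that have not been accessed recently are forgotten first.

```python
from collections import OrderedDict


class AccessTracker:
    """Counts accesses per object in LRU order and flags hot objects."""

    def __init__(self, hot_threshold=3, capacity=1024):
        self.hot_threshold = hot_threshold  # assumed: accesses needed to count as hot
        self.capacity = capacity            # assumed: maximum number of tracked keys
        self.counts = OrderedDict()         # key -> access count, oldest first

    def record_access(self, key):
        """Increment the access count of `key`; return True if it is now hot."""
        count = self.counts.pop(key, 0) + 1
        self.counts[key] = count             # re-insert as most recently used
        if len(self.counts) > self.capacity:
            self.counts.popitem(last=False)  # forget the least recently used key
        return count >= self.hot_threshold
```

A caller would invoke `record_access` once per read/write request and proceed to S103 only when it returns `True`.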
S103: Add the hotspot data to a hotspot queue, and add a hotspot mark to the hotspot data in the hotspot queue;
If the requested data is hotspot data, it is added to the hotspot queue and given a hotspot mark, which identifies the data as having become hotspot data. Note that the hotspot queue may be an actual storage queue in a storage device, or it may be merely the set to which the hotspot data belongs, without any physical storage structure. For example, the hotspot queue may be attached to the hotspot data as an attribute to indicate that the data has been added to the hotspot queue.
As a preferred way of performing this step, after the hotspot data is added to the hotspot queue, the queue may also be sorted by the hotness of each piece of hotspot data, so that the tail of the hotspot queue holds the hotspot data with the lowest hotness. How hotness is calculated is not specifically limited here. It can usually be determined from the access frequency of each piece of hotspot data per unit time, where the unit time may be a preset period such as one day, one week or one month. In addition, if several pieces of hotspot data have the same access frequency per unit time, hotness can be further determined from the last access time: the closer the last access time is to the current time, the higher the hotness.
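The ordering described above (access frequency per unit time first, last access time as the tie-breaker) can be sketched as a single sort. The record type `HotEntry` and its field names are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class HotEntry:
    key: str
    accesses_in_period: int  # access count within the unit period
    last_access: float       # timestamp of the most recent access


def sort_hot_queue(queue):
    """Return entries hottest first, so the coldest entry ends up at the tail."""
    return sorted(
        queue,
        key=lambda e: (e.accesses_in_period, e.last_access),
        reverse=True,
    )
```

Sorting on the tuple `(accesses_in_period, last_access)` means the last access time only matters when two entries have the same access count.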
S104: Cache the hotspot data in the hotspot queue to the cluster local storage.
This step caches the hotspot data to the cluster local storage. Note that, by default in this step, the client sends the hotspot queue to the master node of the distributed storage cluster, and the master node then caches the hotspot data in the hotspot queue to the cluster local storage. The cluster local storage sits at the bottom layer of the distributed storage cluster and can be accessed by all clients in the cluster, so cache consistency among the clients is achieved. Specifically, hotspot data must also be distinguished from non-hotspot data according to the hotspot mark added in step S103.
Note that only one hotspot queue needs to exist in the entire distributed storage cluster. Each client can determine the corresponding hotspot data from the read/write state of its own data and report it to the hotspot queue in the cluster.
How the hotspot data is cached is not specifically limited here. One option is to first determine the ID of the block object in which the hotspot data carrying the hotspot mark is located, and then cache the hotspot data together with the corresponding block object ID to the cluster local storage. For ease of management, the shared volume in the distributed storage cluster is partitioned, that is, divided evenly into block objects of a preset size, so every piece of data belongs to some block object. Partitioning the shared volume improves the efficiency of looking up hotspot data in the cache, which is equivalent to building an index over the data. When this step is performed, the ID of the block object in which the hotspot data is located can be determined first, and the corresponding block object ID is stored in the cluster local storage at the same time as the hotspot data is cached. Note that the whole block object is not cached directly: because a block object may contain non-hotspot data, only the hotspot data within the block object is saved when caching.
It is easy to understand that this embodiment can be executed once each time the client receives a read/write request. If the requested data was already hotspot data before its access count was incremented, the increment means that the hotness of the requested data has risen, and its position in the hotspot queue can move closer to the head of the queue. Moreover, when the requested data is already hotspot data, the corresponding hotspot data can be fetched directly from the cluster local storage to answer the request.
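The block object lookup can be sketched as follows, assuming block objects are numbered from 0 in volume order and that the preset size is 4 MiB. Both are illustrative assumptions, as are the function names and the dictionary layout of `cache`. Only the hot byte range is stored, not the whole block object.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # assumed preset block object size (4 MiB)


def block_object_ids(offset, length, block_size=BLOCK_SIZE):
    """Return the IDs of the block objects touched by [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def cache_hot_range(cache, volume, offset, length, block_size=BLOCK_SIZE):
    """Store only the hot bytes, as cache[block ID][offset within block]."""
    for block_id in block_object_ids(offset, length, block_size):
        block_start = block_id * block_size
        start = max(offset, block_start)
        end = min(offset + length, block_start + block_size)
        cache.setdefault(block_id, {})[start - block_start] = volume[start:end]
```

Keying the cache by block object ID lets a later read find candidate cached segments by computing the block IDs for its own offset and length, without scanning the whole cache.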
In this embodiment of the present application, the client identifies hotspot data and saves it, via the hotspot queue, to the cluster local storage at the bottom layer of the distributed storage cluster. The local storage engine at the bottom layer of the cluster guarantees the consistency of the data cached for each client, and the distributed storage cluster bears the resources occupied by the hotspot data. This reduces the resource consumption caused by clients caching hotspot data, improves client service performance, ensures that all clients share a consistent cache, and further improves cache performance in a distributed shared storage system.
Based on the above embodiment, as a preferred embodiment, the amount of hotspot data should stay relatively stable, that is, hotspot data must not grow without limit. To this end, when the hotspot queue is full, the following step may also be performed:
Move the hotspot data at the tail of the hotspot queue to an aging queue, and change the hotspot mark of each piece of hotspot data in the aging queue to a hotspot aging mark.
When the hotspot queue is full and new hotspot data arrives, the hotspot data at the tail of the hotspot queue is removed and added to the aging queue. This assumes that the hotspot queue has already been sorted by the hotness of each piece of hotspot data, so that the data at the tail is the hotspot data with the lowest current hotness. Note that the new hotspot data usually has the same or a similar access frequency as the hotspot data at the tail of the hotspot queue. However, because the new hotspot data was obviously accessed more recently than the hotspot data already in the queue, the hotspot data at the tail can be removed unconditionally.
Note that the aging queue is intended to hold the hotspot data that has been displaced. Being in the aging queue does not mean that the data is no longer hotspot data, so not all data in the aging queue is removed from the cache. It must be further determined whether data carrying the hotspot aging mark really needs to be removed from the cache.
How hotspot data removal is performed is not specifically limited here. As a preferred way of performing it in this embodiment, the following steps may be performed:
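The hand-off from the hotspot queue to the aging queue can be sketched as below. The class name `HotQueue`, the mark values, and the choice to append new data at the tail are assumptions. The source does not describe re-promoting data from the aging queue, so that branch is labeled as an assumption in the code. Displaced data is only re-marked here; it is not yet removed from the cache.

```python
HOT, AGING = "hot", "hot-aging"  # assumed mark values


class HotQueue:
    """Bounded hotspot queue whose displaced tail entries go to an aging queue."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hot = []    # assumed to be kept hottest first; tail is the coldest
        self.aging = []  # entries displaced from the hotspot queue
        self.marks = {}  # key -> HOT or AGING

    def add(self, key):
        if key in self.hot:
            return
        if len(self.hot) >= self.capacity:
            evicted = self.hot.pop()     # tail = lowest hotness
            self.aging.append(evicted)
            self.marks[evicted] = AGING  # hotspot mark -> hotspot aging mark
        if key in self.aging:
            # Assumption, not described in the source: data that becomes hot
            # again leaves the aging queue.
            self.aging.remove(key)
        self.hot.append(key)
        self.marks[key] = HOT
```

Whether an aged entry is actually dropped from the cluster local storage is a separate decision, made by the aging check in steps S201 to S204.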
S201: Obtain the access count of the hotspot data within the unit period;
S202: Calculate the aging parameter of the hotspot data according to a preset formula;
S203: Determine whether the difference between the aging parameter of the hotspot data and its access count within the unit period is less than the removal threshold; if not, go to S204;
S204: Remove the hotspot data from the cluster local storage;
where the preset formula is:
Figure PCTCN2021076978-appb-000002
where D = t - t_1, β is the aging parameter, D is the access time interval, t is the current time, t_1 is the time at which the hotspot mark was applied, δ is the preset decay factor, and T is the unit period.
Neither the removal threshold nor the preset decay factor is specifically limited here; both can be set by those skilled in the art according to actual removal requirements. In addition, steps S201 and S202 are independent of each other and have no fixed execution order; they only need to be completed before the judgment of S203 is performed. It is easy to understand that the above process is merely one detailed way, provided by this embodiment, of removing hotspot data in the aging queue. Those skilled in the art may make any improvement to this process, and all such improvements fall within the protection scope of the present application.
Note that the executing entity of this embodiment does not have to be the client. Because the hotspot queue is held by the server side of the distributed storage cluster, the hotspot data aging process in this embodiment can also be performed by the server side. This reduces the hotspot data processing load on the clients, avoids repeated operations across different clients, and reduces wasted resources. In addition, by providing an aging queue, this embodiment ensures that the hotspot data in the hotspot queue is the hottest data currently in the distributed storage cluster, which helps improve the utilization efficiency of the cluster cache.
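The decision in S203 and S204 can be sketched without committing to the preset formula, which appears only as a figure in the source and is not reproduced here. In the sketch below, `compute_beta` is a hypothetical caller-supplied function standing in for that formula; the function names and the layout of `aging_queue` are likewise assumptions.

```python
def should_evict(aging_parameter, accesses_in_period, threshold):
    """S203: evict unless (aging parameter - access count) is below the threshold."""
    return not (aging_parameter - accesses_in_period < threshold)


def sweep_aging_queue(aging_queue, cache, compute_beta, threshold, now):
    """Remove from `cache` every aged entry that fails the S203 check.

    `aging_queue` maps key -> (mark_time, accesses_in_period).
    `compute_beta(interval)` is a placeholder for the preset formula.
    """
    evicted = []
    for key, (mark_time, accesses) in list(aging_queue.items()):
        beta = compute_beta(now - mark_time)  # interval D = t - t_1
        if should_evict(beta, accesses, threshold):
            cache.pop(key, None)              # S204: drop from cluster local storage
            del aging_queue[key]
            evicted.append(key)
    return evicted
```

Entries that pass the check are left in the aging queue and in the cache; what happens to them afterwards is not specified in the source.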
Based on the above embodiment, as a preferred embodiment, when a read operation request for hotspot data is received, the following steps may also be included:
S301: Determine whether all the hotspot data corresponding to the read operation is located in the cluster local storage; if not, go to S302;
S302: Obtain a first part of the requested data from the cluster local storage, and obtain a second part of the requested data from disk;
S303: Merge the first part and the second part and return the result to the requester of the read operation.
It is easy to understand that, because of message lag between the server side and the client, or because data state is updated too quickly, the data actually targeted by a client's read operation request for hotspot data is not necessarily all hotspot data currently cached in the cluster local storage of the distributed storage system. Part of the data corresponding to the read operation request may already have become aged data carrying the hotspot aging mark. Alternatively, the partitioning granularity used for the shared volume may be too coarse, making each block object large, so that the hotspot data of a given block object was not completely cached. In such cases, the data can be obtained according to where the actually requested data resides: the first part, which is cached in the cluster local storage, is read directly from the cache, while the other part requires a disk IO operation to obtain the second part. The data finally returned to the requester should contain both the first part and the second part.
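Steps S301 to S303 can be sketched as a merge of cached segments with disk reads for the gaps. The function name, the representation of cached data as `(start, bytes)` pairs, and the `read_disk(offset, length)` callback are all hypothetical.

```python
def read_range(offset, length, cached_segments, read_disk):
    """Serve [offset, offset + length) from cache where possible, else from disk.

    `cached_segments` is an iterable of (start_offset, data) pairs.
    `read_disk(offset, length)` must return exactly `length` bytes.
    """
    end = offset + length
    result = bytearray(length)
    covered = [False] * length
    for seg_start, data in cached_segments:  # first part: from the cache
        lo = max(offset, seg_start)
        hi = min(end, seg_start + len(data))
        for pos in range(lo, hi):
            result[pos - offset] = data[pos - seg_start]
            covered[pos - offset] = True
    i = 0
    while i < length:                        # second part: from disk
        if covered[i]:
            i += 1
            continue
        j = i
        while j < length and not covered[j]:
            j += 1
        result[i:j] = read_disk(offset + i, j - i)  # one IO per uncovered gap
        i = j
    return bytes(result)
```

Each contiguous uncovered gap results in a single disk read, so a fully cached request issues no disk IO at all.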
A hotspot data caching system provided by an embodiment of the present application is introduced below. The hotspot data caching system described below and the hotspot data caching method described above may be cross-referenced.
Referring to FIG. 2, which is a schematic structural diagram of a hotspot data caching system provided by an embodiment of the present application, the present application also provides a hotspot data caching system, including:
a receiving module 100, configured to receive a read/write request and determine the requested data corresponding to the read/write request;
a judgment module 200, configured to increment the access count of the requested data by one and determine, according to the current access count, whether the requested data is hotspot data;
a hotspot marking module 300, configured to add the hotspot data to a hotspot queue when the judgment result of the judgment module is yes;
a cache module 400, configured to add a hotspot mark to the hotspot data in the hotspot queue and cache the hotspot data in the hotspot queue to the cluster local storage.
Based on the above embodiment, as a preferred embodiment, the judgment module 200 includes:
a judgment unit, configured to determine, using a least recently used policy according to the current access count, whether the requested data is hotspot data.
Based on the above embodiment, as a preferred embodiment, the system may further include:
a hotspot sorting module, configured to sort the hotspot queue by the hotness of each piece of hotspot data in the hotspot queue;
wherein the tail of the hotspot queue holds the hotspot data with the lowest hotness.
Based on the above embodiment, as a preferred embodiment, the system may further include:
an aging processing module, configured to, when the hotspot queue is full, move the hotspot data at the tail of the hotspot queue to an aging queue and change the hotspot mark of each piece of hotspot data in the aging queue to a hotspot aging mark.
Further, the aging processing module may include:
an aging processing unit, configured to obtain the access count of the hotspot data within the unit period; calculate the aging parameter of the hotspot data according to a preset formula; determine whether the difference between the aging parameter of the hotspot data and its access count within the unit period is less than the removal threshold; and if not, remove the hotspot data from the cluster local storage;
where the preset formula is:
Figure PCTCN2021076978-appb-000003
where D = t - t_1, β is the aging parameter, D is the access time interval, t is the current time, t_1 is the hotspot marking time of the hotspot data, δ is the preset decay factor, and T is the unit period.
Based on the above embodiment, as a preferred embodiment, the system further includes:
a data retrieval module, configured to determine whether all the hotspot data corresponding to the read operation is located in the cluster local storage; if not, obtain a first part of the requested data from the cluster local storage and a second part of the requested data from disk; and merge the first part and the second part and return the result to the requester of the read operation.
Based on the above embodiment, as a preferred embodiment, the cache module 400 includes:
a determining unit, configured to determine the ID of the block object in which the hotspot data carrying the hotspot mark is located;
a cache unit, configured to cache the hotspot data and the corresponding block object ID to the cluster local storage.
The present application also provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program can implement the steps provided by the above embodiments. The storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The present application also provides a server, which may include a memory and a processor, wherein a computer program is stored in the memory, and the processor can implement the steps provided by the above embodiments when it invokes the computer program in the memory. The server may of course also include components such as network interfaces and a power supply.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the system provided by the embodiments corresponds to the method provided by the embodiments, its description is relatively brief; for related details, refer to the description of the method.
Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.

Claims (10)

  1. A hotspot data caching method, comprising:
    receiving a read/write request, and determining the requested data corresponding to the read/write request;
    incrementing the access count of the requested data by one, and determining, according to the current access count, whether the requested data is hotspot data;
    if so, adding the hotspot data to a hotspot queue, and adding a hotspot mark to the hotspot data in the hotspot queue;
    caching the hotspot data in the hotspot queue to cluster local storage.
  2. The hotspot data caching method according to claim 1, wherein determining, according to the current access count, whether the requested data is hotspot data comprises:
    determining, using a least recently used policy according to the current access count, whether the requested data is hotspot data.
  3. The hotspot data caching method according to claim 1, further comprising, after adding the hotspot data to the hotspot queue:
    sorting the hotspot queue by the hotness of each piece of hotspot data in the hotspot queue;
    wherein the tail of the hotspot queue holds the hotspot data with the lowest hotness.
  4. The hotspot data caching method according to claim 3, wherein, when the hotspot queue is full, adding the hotspot data to the hotspot queue comprises:
    after moving the hotspot data at the tail of the hotspot queue to an aging queue, adding the hotspot data to the hotspot queue;
    changing the hotspot mark of each piece of hotspot data in the aging queue to a hotspot aging mark.
  5. The hotspot data caching method according to claim 4, further comprising, after changing the hotspot mark of each piece of hotspot data in the aging queue to a hotspot aging mark:
    obtaining the access count of the hotspot data within the unit period;
    calculating the aging parameter of the hotspot data according to a preset formula;
    determining whether the difference between the aging parameter of the hotspot data and its access count within the unit period is less than the removal threshold;
    if not, removing the hotspot data from the cluster local storage;
    where the preset formula is:
    Figure PCTCN2021076978-appb-100001
    where D = t - t_1, β is the aging parameter, D is the access time interval, t is the current time, t_1 is the time at which the hotspot mark was applied, δ is the preset decay factor, and T is the unit period.
  6. 根据权利要求1所述的热点数据缓存方法,其特征在于,当接收到热点数据的读操作请求时,还包括:The method for caching hotspot data according to claim 1, wherein when receiving a read operation request for hotspot data, the method further comprises:
    判断所述读操作对应的热点数据是否均位于所述集群本地存储中;Determine whether the hotspot data corresponding to the read operation are all located in the local storage of the cluster;
    若否,从所述集群本地存储中获取所述请求数据的第一部分数据,从磁盘获取所述请求数据的第二部分数据;If not, obtain the first part of the requested data from the cluster local storage, and obtain the second part of the requested data from the disk;
    将所述第一部分数据和所述第二部分数据合并后返回至所述读操作的请求方。The first part of the data and the second part of the data are combined and returned to the requester of the read operation.
  7. 根据权利要求1所述的热点数据缓存方法,其特征在于,将所述热点队列中的热点数据缓存至集群本地存储包括:The hotspot data caching method according to claim 1, wherein the caching of the hotspot data in the hotspot queue to the cluster local storage comprises:
    确认包括所述热点标记的热点数据所在的块对象ID;Confirm the block object ID where the hotspot data including the hotspot mark is located;
    将所述热点数据和对应的块对象ID缓存至集群本地存储。The hotspot data and the corresponding block object ID are cached to the local storage of the cluster.
  8. A hotspot data caching system, comprising:
    a receiving module, configured to receive a read/write request and determine the requested data corresponding to the read/write request;
    a judgment module, configured to increment the access count of the requested data by one and to determine, according to the current access count, whether the requested data is hotspot data;
    a hotspot marking module, configured to add the hotspot data to a hotspot queue when the determination result of the judgment module is yes;
    a cache module, configured to add a hotspot mark to the hotspot data in the hotspot queue and to cache the hotspot data in the hotspot queue to the cluster local storage.
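The four modules of claim 8 could cooperate roughly as in the sketch below. The class name `HotspotCache`, the method names, the fixed access-count threshold used as the hotness criterion, and the use of a wall-clock timestamp as the hotspot mark are assumptions for illustration; the claim itself only states that hotness is judged from the current access count.

```python
import time
from collections import Counter, deque
from typing import Deque, Dict


class HotspotCache:
    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold                # assumed hotness criterion
        self.access_counts: Counter = Counter()
        self.hot_queue: Deque[str] = deque()
        self.hot_marks: Dict[str, float] = {}     # key -> time the hotspot mark was added
        self.local_store: Dict[str, bytes] = {}   # stands in for cluster local storage

    def receive(self, key: str, backing: Dict[str, bytes]) -> bytes:
        """Receiving, judgment, and hotspot marking modules."""
        # Judgment module: increment the access count, then test for hotness.
        self.access_counts[key] += 1
        is_hot = self.access_counts[key] >= self.threshold
        # Hotspot marking module: enqueue newly hot data exactly once.
        if is_hot and key not in self.hot_marks and key not in self.hot_queue:
            self.hot_queue.append(key)
        return self.local_store.get(key, backing[key])

    def flush(self, backing: Dict[str, bytes]) -> None:
        """Cache module: mark queued data and copy it to local storage."""
        while self.hot_queue:
            key = self.hot_queue.popleft()
            self.hot_marks[key] = time.time()
            self.local_store[key] = backing[key]


if __name__ == "__main__":
    disk = {"a": b"x", "b": b"y"}
    cache = HotspotCache(threshold=2)
    for k in ("a", "b", "a"):
        cache.receive(k, disk)
    cache.flush(disk)
    print(sorted(cache.local_store))
```

Separating `receive` from `flush` mirrors the claim's split between queueing hotspot data and caching it, so the caching step can run asynchronously from request handling.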
  9. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the steps of the method according to any one of claims 1-7 are implemented.
  10. A server, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the method according to any one of claims 1-7 when it invokes the computer program in the memory.
PCT/CN2021/076978 2020-07-24 2021-02-20 Hotspot data caching method and system, and related device WO2022016861A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010724366.9 2020-07-24
CN202010724366.9A CN111857597A (en) 2020-07-24 2020-07-24 Hot spot data caching method, system and related device

Publications (1)

Publication Number Publication Date
WO2022016861A1 (en) 2022-01-27

Family

ID=72950077

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076978 WO2022016861A1 (en) 2020-07-24 2021-02-20 Hotspot data caching method and system, and related device

Country Status (2)

Country Link
CN (1) CN111857597A (en)
WO (1) WO2022016861A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115996227A (en) * 2022-12-23 2023-04-21 中国联合网络通信集团有限公司 Data sharing method, device, system, server and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111857597A (en) * 2020-07-24 2020-10-30 浪潮电子信息产业股份有限公司 Hot spot data caching method, system and related device
CN112269837A (en) * 2020-11-17 2021-01-26 珠海大横琴科技发展有限公司 Data processing method and device
CN112732190B (en) * 2021-01-07 2023-01-10 苏州浪潮智能科技有限公司 Method, system and medium for optimizing data storage structure
CN113687781A (en) * 2021-07-30 2021-11-23 济南浪潮数据技术有限公司 Method, device, equipment and medium for pulling up thermal data
CN113727128B (en) * 2021-08-31 2023-07-07 上海哔哩哔哩科技有限公司 Hot spot flow processing method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177005A (en) * 2011-12-21 2013-06-26 深圳市腾讯计算机系统有限公司 Processing method and system of data access
CN109120709A (en) * 2018-09-03 2019-01-01 杭州云创共享网络科技有限公司 A kind of caching method, device, equipment and medium
US20190197027A1 (en) * 2016-12-19 2019-06-27 Tencent Technology (Shenzhen) Company Limited Data management method and server
CN109992597A (en) * 2019-03-11 2019-07-09 福建天泉教育科技有限公司 A kind of storage method and terminal of hot spot data
CN111857597A (en) * 2020-07-24 2020-10-30 浪潮电子信息产业股份有限公司 Hot spot data caching method, system and related device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125247A (en) * 2019-12-06 2020-05-08 北京浪潮数据技术有限公司 Method, device, equipment and storage medium for caching redis client

Also Published As

Publication number Publication date
CN111857597A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
WO2022016861A1 (en) Hotspot data caching method and system, and related device
US20210011888A1 (en) Intelligent layout of composite data structures in tiered storage with persistent memory
CN101493826B (en) Database system based on WEB application and data management method thereof
US20170116136A1 (en) Reducing data i/o using in-memory data structures
US20090307329A1 (en) Adaptive file placement in a distributed file system
US10235047B2 (en) Memory management method, apparatus, and system
WO2015110046A1 (en) Cache management method and device
US20200081867A1 (en) Independent evictions from datastore accelerator fleet nodes
CN110555001B (en) Data processing method, device, terminal and medium
WO2021093365A1 (en) Gpu video memory management control method and related device
CN107341114B (en) Directory management method, node controller and system
CN105635196A (en) Method and system of file data obtaining, and application server
CN107992270B (en) Method and device for globally sharing cache of multi-control storage system
CN113687781A (en) Method, device, equipment and medium for pulling up thermal data
CN110908965A (en) Object storage management method, device, equipment and storage medium
JP2013156765A (en) Information processing apparatus, distributed processing system, cache management program, and distributed processing method
US20170364442A1 (en) Method for accessing data visitor directory in multi-core system and device
CN101483668A (en) Network storage and access method, device and system for hot spot data
CN111221773B (en) Data storage architecture method based on RDMA high-speed network and skip list
JPH11143779A (en) Paging processing system for virtual storage device
CN109977074B (en) HDFS-based LOB data processing method and device
JP2012018607A (en) Distributed cache system
CN117033831A (en) Client cache method, device and medium thereof
JPH07239808A (en) Distributed data managing system
CN107704596A (en) A kind of method, apparatus and equipment for reading file

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21847038; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 21847038; Country of ref document: EP; Kind code of ref document: A1)