CN113190169A - Lightweight active expired data caching method and system - Google Patents

Lightweight active expired data caching method and system

Info

Publication number
CN113190169A
CN113190169A
Authority
CN
China
Prior art keywords
data
memory
module
storage
disk storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110135675.7A
Other languages
Chinese (zh)
Inventor
解一豪
赵文慧
赵振修
周庆勇
李明明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202110135675.7A
Publication of CN113190169A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 Improving or facilitating administration, e.g. storage management
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0629 Configuration or reconfiguration of storage systems
    • G06F 3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G06F 3/064 Management of blocks
    • G06F 3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F 3/0647 Migration mechanisms
    • G06F 3/0649 Lifecycle management
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a lightweight active expired data caching method and system, belonging to the field of data storage. The method comprises the following specific steps: S1, reading data using the disk store; S2, writing data using the disk store; S3, employing a polling thread to handle LRU switching of memory data and deletion of expired data. The cache of the invention is not stored entirely in memory but is split between memory and disk: memory holds only the more frequently accessed data, while less frequently accessed data is kept on disk, reducing memory footprint. Data switching between memory and disk is synchronized through a combination of LRU, bloom filtering and polling, keeping CPU consumption as low as possible. The cache component needs no independent deployment; it is embedded in the user's process as a dynamic link library, jar package or similar, avoiding extra maintenance cost and network communication cost.

Description

Lightweight active expired data caching method and system
Technical Field
The invention discloses a lightweight active expired data caching method and system, and relates to the technical field of data storage.
Background
Data caching is one of the commonly used technologies in the computer and communication fields and is widely applied in scenarios such as instant messaging, e-commerce and live video streaming. Its main idea is to preload frequently accessed hot data into registers, memory or high-performance external storage devices, thereby reducing the number of accesses to the actual storage medium and ultimately reducing the overall service response time and the pressure on physical storage devices or storage servers.
A common design for a data cache is to allocate a large number of memory blocks on the system heap in memory pool fashion and to organize the data as a hash table. Because system memory capacity is always limited, part of the outdated data often needs to be deleted, and when memory is full an LRU algorithm deletes the data that has not been used for a long time or moves it to an external cache (the redis approach).
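The hash table plus LRU organization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class and parameter names are ours, Python's OrderedDict stands in for the hash table with recency ordering, and a per-entry TTL is checked lazily on each read.

```python
from collections import OrderedDict
import time

class LruCache:
    """Hash-table cache with LRU eviction and per-entry TTL (illustrative sketch)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()           # key -> (value, expire_at)

    def get(self, key):
        item = self.data.get(key)
        if item is None:
            return None
        value, expire_at = item
        if time.monotonic() >= expire_at:   # lazily drop expired entries on read
            del self.data[key]
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return value

    def put(self, key, value, ttl):
        if key in self.data:
            del self.data[key]
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)   # evict the least recently used entry
        self.data[key] = (value, time.monotonic() + ttl)
```

With capacity 2, inserting a third key evicts whichever of the first two was touched least recently.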
Currently, common data caching tools such as memcache and redis are generally provided as independent services: they usually require a dedicated server for deployment and place certain demands on hardware. Common embedded caches, in turn, are often constrained by the host system itself, have limited storage capacity, and depend on system memory.
Aiming at the large footprint, high CPU consumption and heavy memory dependence of current common data cache implementations, the invention changes the traditional implementation approach and provides a lightweight active expired data caching method and system to solve these problems.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a lightweight active expired data caching method and system. The adopted technical scheme is as follows: a lightweight active expired data caching method comprises the following specific steps:
S1, reading data using the disk store;
S2, writing data using the disk store;
S3, employing a polling thread to additionally handle LRU switching of memory data and deletion of expired data.
The specific step of S1, reading data using the disk store, comprises:
S101, querying the memory store, and if found, returning directly;
S102, if not found, querying the disk store, and if still not found, returning directly;
S103, if found in the disk store, synchronizing the data to the memory store and returning.
The specific step of S2, writing data using the disk store, comprises:
S201, querying the memory store and the disk store, and deleting any existing entries for the key;
S202, writing the data into the memory store;
S203, if the memory store is full, executing the LRU algorithm to evict one entry to the disk store.
The specific steps of S3, using a polling thread to additionally handle LRU switching of memory data and deletion of expired data, comprise:
S301, the polling thread traversing each memory store entry at a configured interval;
S302, deleting expired memory data;
S303, switching cold data to the disk store through the LRU algorithm.
The disk store in S1 is implemented based on a B+ tree.
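The Python standard library has no B+ tree, so as a hedged stand-in the following sketch backs the disk store with SQLite, whose table storage is itself B-tree based. The class name and schema are ours for illustration, not from the patent.

```python
import sqlite3

class DiskStore:
    """On-disk key/value store. The patent specifies a B+ tree; SQLite's
    B-tree-backed table storage stands in for it in this sketch."""
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)")

    def get(self, key):
        row = self.db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

    def put(self, key, value):
        # INSERT OR REPLACE overwrites any previous value for the key
        self.db.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))

    def delete(self, key):
        self.db.execute("DELETE FROM kv WHERE k = ?", (key,))
```

A real implementation would persist to a file path rather than `:memory:`; the primary-key index gives the ordered, logarithmic lookups the B+ tree is chosen for.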
A lightweight active expired data cache system comprises a data reading module, a data writing module and a data polling module:
a data reading module: reading data using the disk store;
a data writing module: writing data using the disk store;
a data polling module: employing a polling thread to additionally handle LRU switching of memory data and deletion of expired data.
The data reading module specifically comprises a query module A, a query module B and a synchronization module:
the query module A: querying the memory store, and if found, returning directly;
the query module B: if not found, querying the disk store, and if still not found, returning directly;
the synchronization module: if found in the disk store, synchronizing the data to the memory store and returning.
The data writing module specifically comprises a query module C, a writing module and a replacement module:
the query module C: querying the memory store and the disk store, and deleting any existing entries for the key;
the writing module: writing the data into the memory store;
the replacement module: if the memory store is full, executing the LRU algorithm to evict one entry to the disk store.
The data polling module specifically comprises a detection module, a clearing module and a switching module:
the detection module: the polling thread traversing each memory store entry at a configured interval;
the clearing module: deleting expired memory data;
the switching module: switching cold data to the disk store through the LRU algorithm.
The disk store in the data reading module is implemented based on a B+ tree.
The invention has the following beneficial effects. Compared with the traditional cache implementation, the invention makes these changes:
the cache is not stored entirely in memory, but split between memory and disk;
memory holds only the more frequently accessed data, and less frequently accessed data is kept on disk, reducing memory footprint;
data switching between memory and disk is synchronized through a combination of LRU, bloom filtering and polling, keeping CPU consumption as low as possible;
the cache component needs no independent deployment, and is embedded in the user's process as a dynamic link library, jar package or similar, avoiding extra maintenance cost and network communication cost.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of the method of the present invention; FIG. 2 is a schematic diagram of the system of the present invention; FIG. 3 is a diagram of a cache user process architecture; fig. 4 is a diagram of a conventional cache-mode architecture.
Detailed Description
The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.
The first embodiment is as follows:
a lightweight active outdated data caching method comprises the following specific steps:
s1, reading data by using a disk storage;
s2 writing data by using disk storage;
s3, adopting polling thread to process lru switching of memory data and deleting of expired data;
further, the step S1 of reading data using the disk store specifically comprises:
S101, querying the memory store, and if found, returning directly;
S102, if not found, querying the disk store, and if still not found, returning directly;
S103, if found in the disk store, synchronizing the data to the memory store and returning;
further, the step S2 of writing data using the disk store specifically comprises:
S201, querying the memory store and the disk store, and deleting any existing entries for the key;
S202, writing the data into the memory store;
S203, if the memory store is full, executing the LRU algorithm to evict one entry to the disk store;
further, the specific steps of S3, using a polling thread to additionally handle LRU switching of memory data and deletion of expired data, comprise:
S301, the polling thread traversing each memory store entry at a configured interval;
S302, deleting expired memory data;
S303, switching cold data to the disk store through the LRU algorithm;
still further, the disk store in S1 is implemented based on a B+ tree;
example two:
a lightweight active outdated data cache system comprises a data reading module, a data writing module and a data polling module, wherein the data reading module comprises:
a data reading module: reading data by using a magnetic disk storage;
a data writing module: writing data by using a magnetic disk storage;
a data polling module: a polling thread is employed to additionally handle lru switching of memory data and deletion of stale data.
Further, the data reading module specifically comprises a query module A, a query module B and a synchronization module:
the query module A: querying the memory store, and if found, returning directly;
the query module B: if not found, querying the disk store, and if still not found, returning directly;
the synchronization module: if found in the disk store, synchronizing the data to the memory store and returning.
Further, the data writing module specifically comprises a query module C, a writing module and a replacement module:
the query module C: querying the memory store and the disk store, and deleting any existing entries for the key;
the writing module: writing the data into the memory store;
the replacement module: if the memory store is full, executing the LRU algorithm to evict one entry to the disk store.
Further, the data polling module specifically comprises a detection module, a clearing module and a switching module:
the detection module: the polling thread traversing each memory store entry at a configured interval;
the clearing module: deleting expired memory data;
the switching module: switching cold data to the disk store through the LRU algorithm.
Still further, the disk store in the data reading module is implemented based on a B+ tree.
Compared with the traditional cache implementation, the invention makes these changes:
the cache is not stored entirely in memory, but split between memory and disk;
memory holds only the more frequently accessed data, and less frequently accessed data is kept on disk, reducing memory footprint;
data switching between memory and disk is synchronized through a combination of LRU, bloom filtering and polling, keeping CPU consumption as low as possible;
the cache component needs no independent deployment, and is embedded in the user's process as a dynamic link library, jar package or similar, avoiding extra maintenance cost and network communication cost;
the data switching and synchronization process is as follows:
the storage layer is divided into a memory store, a disk store and a polling thread. The memory store uses a hash table plus LRU; memory is not allocated from an additional memory pool. Each entry is checked on every read and deleted if expired; on writing, if memory is full, data is synchronized to the disk store. The disk store is implemented based on a B+ tree and stores the data evicted from the memory store by LRU. The polling thread periodically switches part of the old data in the memory store to the disk store in LRU order;
the method comprises the following steps that a problem of accessing a disk exists in the inquiry process of data from memory storage to disk storage, if the disk is accessed and inquired too frequently, the performance is influenced, and a bloom filter (bloom filter) is added for recording data written into the disk;
the memory store implements the basic cache operations; the disk store caches the data switched out of the memory store by LRU, overcoming the memory limitation; and the timed polling of the memory store avoids the high CPU consumption caused by the frequent checks that expiration and LRU replacement require in a traditional cache. Only part of the cached data needs to reside in the memory store, so the dependence on system memory is small;
the whole process is as follows:
data reading:
(1) query the memory store; if found, return directly;
(2) if not found, query the disk store; if still not found, return directly;
(3) if found in the disk store, synchronize the data to the memory store and return;
data writing:
(1) query the memory store and the disk store, and delete any existing entries for the key;
(2) write the data into the memory store;
(3) if the memory store is full, execute the LRU algorithm to evict one entry to the disk store;
data polling detection:
(1) the polling thread traverses each memory store entry at a configured interval;
(2) expired memory data is deleted;
(3) cold data is switched to the disk store through the LRU algorithm.
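The three flows above can be sketched together as follows. This is a minimal illustration under our own assumptions: the B+ tree disk store and the bloom filter are simulated by a plain dict and a set, `poll` runs one pass of the polling thread's work, and the capacity, TTL and the fraction of entries switched out per pass are illustrative, not taken from the patent.

```python
import time
from collections import OrderedDict

class TieredCache:
    """Illustrative memory-plus-disk cache with a pollable maintenance pass."""
    def __init__(self, mem_capacity=3, ttl=60.0):
        self.mem = OrderedDict()        # memory store: key -> (value, expire_at)
        self.disk = {}                  # stand-in for the B+ tree disk store
        self.on_disk = set()            # stand-in for the bloom filter
        self.mem_capacity = mem_capacity
        self.ttl = ttl

    # Data reading: memory store first, then disk store; promote disk hits.
    def get(self, key):
        if key in self.mem:
            value, expire_at = self.mem[key]
            if time.monotonic() < expire_at:
                self.mem.move_to_end(key)       # mark as most recently used
                return value
            del self.mem[key]                   # expired: delete on read
        if key not in self.on_disk:             # "bloom filter": skip the disk
            return None
        value = self.disk.get(key)
        if value is not None:
            self._put_mem(key, value)           # synchronize back to memory
        return value

    # Data writing: drop stale copies, write to memory, evict LRU to disk.
    def put(self, key, value):
        self.mem.pop(key, None)
        self.disk.pop(key, None)
        self._put_mem(key, value)

    def _put_mem(self, key, value):
        if len(self.mem) >= self.mem_capacity:
            old_key, (old_value, _) = self.mem.popitem(last=False)
            self.disk[old_key] = old_value      # LRU replacement to disk
            self.on_disk.add(old_key)
        self.mem[key] = (value, time.monotonic() + self.ttl)

    # Data polling: one pass of the polling thread's work.
    def poll(self):
        now = time.monotonic()
        for key in [k for k, (_, exp) in self.mem.items() if exp <= now]:
            del self.mem[key]                   # delete expired entries
        while len(self.mem) > self.mem_capacity // 2:
            key, (value, _) = self.mem.popitem(last=False)
            self.disk[key] = value              # switch cold data to disk
            self.on_disk.add(key)
```

In a real deployment `poll` would run on a dedicated thread at a configured interval; here it is a method so the pass can be driven explicitly.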
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A lightweight active expired data caching method, characterized by comprising the following specific steps:
S1, reading data using the disk store;
S2, writing data using the disk store;
S3, employing a polling thread to additionally handle LRU switching of memory data and deletion of expired data.
2. The method as claimed in claim 1, wherein the specific step of S1, reading data using the disk store, comprises:
S101, querying the memory store, and if found, returning directly;
S102, if not found, querying the disk store, and if still not found, returning directly;
S103, if found in the disk store, synchronizing the data to the memory store and returning.
3. The method as claimed in claim 2, wherein the specific step of S2, writing data using the disk store, comprises:
S201, querying the memory store and the disk store, and deleting any existing entries for the key;
S202, writing the data into the memory store;
S203, if the memory store is full, executing the LRU algorithm to evict one entry to the disk store.
4. The method as claimed in claim 3, wherein the specific step of S3, using a polling thread to additionally handle LRU switching of memory data and deletion of expired data, comprises:
S301, the polling thread traversing each memory store entry at a configured interval;
S302, deleting expired memory data;
S303, switching cold data to the disk store through the LRU algorithm.
5. The method as claimed in claim 4, wherein the disk store in S1 is implemented based on a B+ tree.
6. A lightweight active expired data cache system, characterized by comprising:
a data reading module: reading data using the disk store;
a data writing module: writing data using the disk store;
a data polling module: employing a polling thread to additionally handle LRU switching of memory data and deletion of expired data.
7. The system according to claim 6, wherein the data reading module specifically comprises a query module A, a query module B and a synchronization module:
the query module A: querying the memory store, and if found, returning directly;
the query module B: if not found, querying the disk store, and if still not found, returning directly;
the synchronization module: if found in the disk store, synchronizing the data to the memory store and returning.
8. The system of claim 7, wherein the data writing module specifically comprises a query module C, a writing module and a replacement module:
the query module C: querying the memory store and the disk store, and deleting any existing entries for the key;
the writing module: writing the data into the memory store;
the replacement module: if the memory store is full, executing the LRU algorithm to evict one entry to the disk store.
9. The system of claim 8, wherein the data polling module specifically comprises a detection module, a clearing module and a switching module:
the detection module: the polling thread traversing each memory store entry at a configured interval;
the clearing module: deleting expired memory data;
the switching module: switching cold data to the disk store through the LRU algorithm.
10. The system of claim 9, wherein the disk store in the data reading module is implemented based on a B+ tree.
CN202110135675.7A 2021-02-01 2021-02-01 Lightweight active overdue data caching method and system Pending CN113190169A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110135675.7A CN113190169A (en) 2021-02-01 2021-02-01 Lightweight active overdue data caching method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110135675.7A CN113190169A (en) 2021-02-01 2021-02-01 Lightweight active overdue data caching method and system

Publications (1)

Publication Number Publication Date
CN113190169A true CN113190169A (en) 2021-07-30

Family

ID=76972835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110135675.7A Pending CN113190169A (en) 2021-02-01 2021-02-01 Lightweight active overdue data caching method and system

Country Status (1)

Country Link
CN (1) CN113190169A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037660A1 (en) * 2007-08-04 2009-02-05 Applied Micro Circuits Corporation Time-based cache control
US20090292679A1 (en) * 2008-05-21 2009-11-26 Oracle International Corporation Cascading index compression
CN102999428A (en) * 2012-11-01 2013-03-27 华中科技大学 Four-stage addressing method for tile recording disk
CN107168657A (en) * 2017-06-15 2017-09-15 深圳市云舒网络技术有限公司 It is a kind of that cache design method is layered based on the virtual disk that distributed block is stored
CN109902088A (en) * 2019-02-13 2019-06-18 北京航空航天大学 A kind of data index method towards streaming time series data

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037660A1 (en) * 2007-08-04 2009-02-05 Applied Micro Circuits Corporation Time-based cache control
US20090292679A1 (en) * 2008-05-21 2009-11-26 Oracle International Corporation Cascading index compression
CN102999428A (en) * 2012-11-01 2013-03-27 华中科技大学 Four-stage addressing method for tile recording disk
CN107168657A (en) * 2017-06-15 2017-09-15 深圳市云舒网络技术有限公司 It is a kind of that cache design method is layered based on the virtual disk that distributed block is stored
CN109902088A (en) * 2019-02-13 2019-06-18 北京航空航天大学 A kind of data index method towards streaming time series data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHU, Zhen: "Design and Implementation of the Cache Module of a Campus Information Management System", China Master's Theses Full-text Database, Information Science and Technology *
HU, Yuanyi et al., Xidian University Press *
GE, Kaikai: "Research on Metadata Access Performance Optimization of the Ceph File System", China Master's Theses Full-text Database, Information Science and Technology *

Similar Documents

Publication Publication Date Title
US8225029B2 (en) Data storage processing method, data searching method and devices thereof
CN107168657B (en) Virtual disk hierarchical cache design method based on distributed block storage
US8954648B2 (en) Memory device and operating method thereof
CN103246616A (en) Global shared cache replacement method for realizing long-short cycle access frequency
US20020099907A1 (en) System and method for storing data sectors with header and trailer information in a disk cache supporting memory compression
CN102436421B (en) Data cached method
US20120102298A1 (en) Low RAM Space, High-Throughput Persistent Key-Value Store using Secondary Memory
US20090327621A1 (en) Virtual memory compaction and compression using collaboration between a virtual memory manager and a memory manager
US8868863B2 (en) Method and apparatus for a frugal cloud file system
CN103399823B (en) The storage means of business datum, equipment and system
CN101645043B (en) Methods for reading and writing data and memory device
US7818505B2 (en) Method and apparatus for managing a cache memory in a mass-storage system
CN103270499B (en) log storing method and system
JP2005535051A (en) System and method for using compressed main memory based on compressibility
CN110555001B (en) Data processing method, device, terminal and medium
CN105426321A (en) RDMA friendly caching method using remote position information
CN102117287A (en) Distributed file system access method, a metadata server and client side
CN108089825A (en) A kind of storage system based on distributed type assemblies
CN110109927A (en) Oracle database data processing method based on LSM tree
US20190220443A1 (en) Method, apparatus, and computer program product for indexing a file
CN114356877A (en) Log structure merged tree hierarchical storage method and system based on persistent memory
CN105653720A (en) Database hierarchical storage optimization method capable of achieving flexible configuration
CN108377394A (en) Image data read method, computer installation and the computer readable storage medium of video encoder
CN107133369A (en) A kind of distributed reading shared buffer memory aging method based on the expired keys of redis
US11281594B2 (en) Maintaining ghost cache statistics for demoted data elements

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210730