CN114443722A - Cache management method and device, storage medium and electronic equipment


Info

Publication number
CN114443722A
Authority
CN
China
Prior art keywords
cache
data
module
admission
rule
Prior art date
Legal status
Pending
Application number
CN202210122494.5A
Other languages
Chinese (zh)
Inventor
汪源
余利华
蒋鸿翔
叶子皓
温正湖
Current Assignee
Hangzhou Netease Shuzhifan Technology Co ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd filed Critical Netease Hangzhou Network Co Ltd
Priority to CN202210122494.5A
Publication of CN114443722A
Legal status: Pending

Classifications

    • G PHYSICS; G06 COMPUTING, CALCULATING OR COUNTING; G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/24552: Database cache management (G06F 16/00 Information retrieval; G06F 16/24 Querying; G06F 16/2455 Query execution)
    • G06F 16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; distributed database system architectures therefor
    • G06F 3/0656: Data buffering arrangements (G06F 3/06 Digital input from, or digital output to, record carriers; G06F 3/0655 Vertical data movement)
    • G06F 3/0676: Magnetic disk device (G06F 3/0668 Interfaces adopting a particular infrastructure; G06F 3/0673 Single storage device)

Abstract

According to the cache management method, apparatus, storage medium, and electronic device provided by the embodiments of the disclosure, at least one cache admission rule is read from a configuration file, wherein the at least one cache admission rule includes a cache admission rule associated with hot data characteristics; the cache admission rule is configured to a cache admission control module; and, in response to input data to be cached, the cache admission control module matches target data in the data to be cached based on the cache admission rule and allows the target data to be written into the cache module. The cache admission control module, configured with cache admission rules related to hot data characteristics, thus screens data writes so that only data conforming to the rules enters the cache, improving the cache hit rate. In addition, query performance can be improved by an asynchronous cache-write mechanism using separate threads; likewise, index persistence and recovery of the cached data keep the cached data reusable, improving query efficiency and user experience.

Description

Cache management method and device, storage medium and electronic equipment
Technical Field
The embodiments of the disclosure relate to the technical field of databases, and in particular to a cache management method, a cache management apparatus, a storage medium, and an electronic device.
Background
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims and the description herein is not admitted to be prior art by inclusion in this section.
Online processing in databases can be classified into On-Line Transaction Processing (OLTP) and On-Line Analytical Processing (OLAP). OLTP is primarily directed to transaction processing; OLAP is primarily directed to data analysis, to mine the value of data.
When an online data query engine processes data, the raw data in the data source must be read first before further processing and calculation, so the speed of reading data directly determines the speed of overall execution, and performance optimization of the data-reading link is therefore necessary. To improve performance, a cache module is configured between the data query engine and the database to cache data from the database so that the data can be read quickly when needed, increasing the speed of data reading.
Disclosure of Invention
In this context, embodiments of the present disclosure provide a cache management method, apparatus, storage medium, and electronic device.
According to a first aspect of the present disclosure, there is provided a cache management method for managing a cache module used by a data query engine, the data query engine being configured to query data from a distributed database in a network. The method comprises the following steps: reading at least one cache admission rule based on a configuration file, wherein the at least one cache admission rule comprises a cache admission rule associated with hot data characteristics; configuring the cache admission rule to a cache admission control module; and, in response to input data to be cached, the cache admission control module matching target data in the data to be cached based on the cache admission rule and allowing the target data to be written into the cache module. In this way, the cache admission control module, configured with cache admission rules related to hot data characteristics, screens data writes so that only data conforming to the rules is cached, which improves the cache hit rate and thus the query performance. Optionally, the cache admission control module may further serve as an interface for inputting control instructions.
In some embodiments of the first aspect, the method further comprises: in response to the target data being allowed to be written into the cache module, creating a cache task corresponding to the target data and acquiring a buffer; a read thread reading the target data and copying it to the buffer; and a cache write thread reading the target data from the buffer based on the cache task and writing it into the cache module. A cache write thread working asynchronously with the read thread is thus responsible for writing target data into the cache; on a cache miss, the work of the read thread changes from a complete cache write to a single copy of the data into the buffer, so the time the read thread takes to return data to the scan thread is significantly shortened, the query-performance degradation caused by cache misses is avoided, and smoother query performance is provided.
In some embodiments of the first aspect, the cache module includes a memory area and a disk area; the disk area is used for storing cache data, and the memory area is used for storing cache metadata, the cache metadata being used for indexing the cache data in the disk area. The method comprises a cache persistence process, which comprises: setting the cache module to read-only in response to a persistence instruction; creating an intermediary object, and storing the cache metadata and the description objects of the cache data into the intermediary object; and persisting the intermediary object to form persistent data stored in the disk area. By persisting and recovering, at a proper time, the cache metadata used for indexing the cache data, the problem in the related art that cache data is not reusable and must be refilled after a restart is avoided, effectively improving query efficiency and user experience.
According to a second aspect of the present disclosure, there is provided a cache management apparatus for managing a cache module used by a data query engine, the data query engine being configured to query data from a distributed database in a network. The apparatus comprises: a rule reading module for reading at least one cache admission rule based on a configuration file, wherein the at least one cache admission rule comprises a cache admission rule associated with hot data characteristics; a first configuration module for configuring the cache admission rule to a cache admission control module; and the cache admission control module, for responding to input data to be cached, matching target data in the data to be cached based on the cache admission rule, and allowing the target data to be written into the cache module.
According to a third aspect of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements: a cache management method as claimed in any one of the first aspect.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform, via execution of the executable instructions: a cache management method as claimed in any one of the first aspect.
By screening data writes with a cache admission control module configured with cache admission rules related to hot data characteristics, only data conforming to the rules is cached, so the cache hit rate is improved. In addition, query performance can be improved by an asynchronous cache-write mechanism using separate threads; likewise, index persistence and recovery of the cached data keep the cached data reusable, improving query efficiency and user experience.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 shows an architecture diagram of an exemplary application scenario in an embodiment of the present disclosure.
Fig. 2A is a schematic diagram illustrating a principle of reading data from a buffer in an embodiment of the related art.
FIG. 2B is a schematic diagram illustrating a principle of writing data to a cache module in an embodiment of the related art.
Fig. 3 shows a flowchart of a cache management method in an embodiment of the present disclosure.
Fig. 4 shows a schematic flow chart of asynchronous cache writing in the cache management method in the embodiment of the present disclosure.
Fig. 5 is a schematic diagram illustrating a cache write flow in an application example of the present disclosure.
Fig. 6A shows a flow diagram of a cache persistence process in an embodiment of the present disclosure.
Fig. 6B shows a flow diagram of a cache persistence and recovery process in an embodiment of the disclosure.
FIG. 7 illustrates a flow diagram for persistence and recovery in an application example of the present disclosure.
Fig. 8 is a schematic block diagram illustrating a cache management apparatus according to an embodiment of the present disclosure.
FIG. 9 shows a schematic diagram of a storage medium in an embodiment of the disclosure.
Fig. 10 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the disclosure, a cache management method, a cache management device, a storage medium and an electronic device are provided.
In this document, any number of elements in the drawings is intended to be illustrative and not restrictive, and any nomenclature is used for distinction only and not for any restrictive meaning.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.
Summary of the Invention
With the continuous development of big data technology, how to process and analyze data efficiently has become a problem of great interest to enterprises in the industry. Online processing of data is classified into OLTP and OLAP. In contrast to OLTP, OLAP can meet the requirements of data analysis, and accordingly a variety of OLAP data query engines have emerged, such as Hive, Spark, Presto, Kylin, Impala, Druid, and ClickHouse.
To achieve the best performance, an OLAP data query engine needs to pursue speed and efficiency at each stage of query execution. Generally, the process of executing a query mainly includes: parsing the SQL command (e.g., lexical, syntax, and semantic analysis), making an execution plan, optimizing the execution plan, and executing it. The execution plan includes the various steps required to complete the query, for example table-to-table join operations, calculating aggregation functions such as sums or averages, sorting and grouping data, and scanning and reading data from the data source. Among these steps, scanning data from the data source is usually placed first in the execution plan; that is, the raw data is read before further processing and calculation. It follows that the speed of reading data determines the overall execution speed of the plan, so performance optimization of the data-reading link determines the query performance of the data query engine.
Setting up a cache is one means to improve the performance of the data-reading link. Fig. 1 illustrates a data query engine 101 and a database 102 in which the data sources reside. The data-reading speed is improved by arranging a cache module 103 between the data query engine 101 and the database 102 to cache frequently used data from the database 102, so that it can be read quickly and preferentially from the cache module 103 when needed.
In a possible example, the database 102 may be a distributed database, such as a Hadoop Distributed File System (HDFS) deployed in the network 104. The data query engine 101 needs to query the required data from the local device across the network 104, so configuring the cache module 103 for the data query engine 101 can reduce the number of accesses to the network 104 and thereby reduce the interference of network 104 fluctuations on the data-reading speed.
Taking the database 102 as an HDFS as an example, the cache module 103 (DataCache) of the data query engine 101 may be deployed in the local device 100 (e.g., a server or another type of device) and may include storage areas in the disk and the memory of the local device 100, each storing part of the data. The first part is stored in local disk space and is called the cache file (CacheFile); it stores, for example, data segments of HDFS files. The second part is stored in local memory space and is called the cache metadata (MetaCache); it stores the mapping from the index of a data segment in the HDFS to a position in a cache file in the disk space. When the data query engine 101 needs to read a piece of data of a file on the HDFS, it first queries the cache module 103. If the cache module 103 has cached the data, it reads it from there, which is called a cache hit; if the data is not cached, it must be queried from the HDFS.
Fig. 2A is a schematic diagram showing how data is read from the cache in an embodiment of the related art.
Specifically, in fig. 2A, the reading module (HdfsFileReader) responsible for reading cached data submits to the caching module the relevant information of the target data to be read, including the file name (Filename) of the full path, the last modification time (Mtime), and the offset (offset); together these form a cache key (CacheKey) that indicates the location of the target data, for example a certain segment of a certain version of a certain file. The cache metadata (MetaCache) records the association between all cache keys and cache entries (CacheEntry); a unique corresponding cache entry can be found through a cache key. A cache entry records the location of the target data in the cache file, which may include the cache file (CacheFile) information, an offset (offset), and a data length (len). The process of searching and reading cached data shown in fig. 2A includes steps S201A to S204A. As in S201A, the cache metadata is searched using each cache key of the target data; if all keys are found, a cache hit is indicated, as in S202A. If all are hit, then as in S203A the number of bytes to read (btr) provided by the reading module is obtained, and the number of bytes actually read is determined as the smaller of the bytes to read and the segment length, min(btr, len). As in S204A, the target data is read from the cache module into a buffer and returned to the reading module.
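The lookup flow just described lends itself to a short sketch. The following C++ is a minimal illustration, assuming hypothetical type and field names that merely mirror the CacheKey/CacheEntry/MetaCache terminology above; the patent does not disclose the actual structures, so everything here is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical types mirroring the terminology above; the embodiment's
// real structures are not disclosed at this level of detail.
struct CacheKey {
    std::string filename;  // full path of the HDFS file
    int64_t mtime;         // last modification time (version)
    int64_t offset;        // offset of the data segment
    bool operator==(const CacheKey& o) const {
        return filename == o.filename && mtime == o.mtime && offset == o.offset;
    }
};

struct CacheKeyHash {
    size_t operator()(const CacheKey& k) const {
        return std::hash<std::string>()(k.filename)
             ^ std::hash<int64_t>()(k.mtime) ^ std::hash<int64_t>()(k.offset);
    }
};

struct CacheEntry {
    int file_id;     // which cache file in the disk area
    int64_t offset;  // position inside that cache file
    int64_t len;     // length of the cached segment
};

// MetaCache: cache keys -> cache entries, kept in memory.
using MetaCache = std::unordered_map<CacheKey, CacheEntry, CacheKeyHash>;

// Lookup (steps S201A-S204A): on a hit, return how many bytes to copy
// from the cache file, i.e. min(bytes_to_read, len); on a miss, return
// nothing so the caller falls back to reading from the HDFS.
std::optional<int64_t> Lookup(const MetaCache& meta, const CacheKey& key,
                              int64_t bytes_to_read /* btr */) {
    auto it = meta.find(key);                   // S201A/S202A: hit test
    if (it == meta.end()) return std::nullopt;  // miss
    return std::min(bytes_to_read, it->second.len);  // S203A
    // S204A: the caller then copies that many bytes into its buffer.
}
```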
In addition, if there is a miss, for example a cache key is not found in the cache metadata, the reading module needs to read the data from the HDFS on the network side and then write it into the cache module; for simplicity this is not shown in the figure.
Referring to fig. 2B, a schematic diagram of writing (Store) data to the cache module in an embodiment of the related art is shown. The write flow in fig. 2B includes steps S201B to S204B.
As shown in the figure, during a cache write, a buffer with a buffer length (buffer_len) equal to the length of the data to be written is first allocated to hold the data to be written into the cache module; as in S201B, the data is written from the buffer into the cache file. As in S202B, a cache entry is generated from the cache file information, the write location, and the data length, and inserted into the cache metadata; and as in S203B, the cache key is inserted into the cache metadata and associated with the cache entry. The insertions of steps S202B and S203B may be performed in parallel. If the total amount of data cached by the cache module exceeds the configured limit, step S204B triggers eviction of cache entries: according to a preset cache eviction policy, the cache module continuously evicts cache entries until the total amount of cached data falls below the limit. When a cache entry is evicted from the cache metadata, the data segment in the cache file corresponding to that index is also deleted.
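Continuing the hypothetical types from the previous sketch, the following C++ illustrates steps S201B to S204B with an LRU-style eviction loop; capacity accounting and disk I/O are reduced to placeholders, so this is a sketch of the described flow, not the embodiment's implementation.

```cpp
#include <list>

// Minimal write path (S201B-S204B) with eviction until under the limit.
struct SimpleCache {
    MetaCache meta;
    std::list<CacheKey> lru;     // most recently inserted at the front
    int64_t used = 0;            // bytes currently cached
    int64_t capacity;            // configured limit

    explicit SimpleCache(int64_t cap) : capacity(cap) {}

    void Store(const CacheKey& key, const char* buffer, int64_t buffer_len) {
        int64_t pos = AppendToCacheFile(buffer, buffer_len);       // S201B
        meta[key] = CacheEntry{/*file_id=*/0, pos, buffer_len};    // S202B/S203B
        lru.push_front(key);
        used += buffer_len;
        while (used > capacity && !lru.empty()) {  // S204B: evict until under limit
            const CacheKey& victim = lru.back();
            used -= meta[victim].len;
            DeleteSegment(meta[victim]);  // also drop the segment on disk
            meta.erase(victim);
            lru.pop_back();
        }
    }

    // Placeholders standing in for real disk I/O in the cache file:
    int64_t AppendToCacheFile(const char*, int64_t n) {
        static int64_t end = 0;  // pretend append position
        int64_t p = end; end += n; return p;
    }
    void DeleteSegment(const CacheEntry&) {}
};
```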
As described above, with the local cache module the data query engine can turn remote reads from the distributed database over the network into local reads, increasing reading speed and reducing performance fluctuation.
As described above, although the cache module can improve the query performance of the data query engine, the technology is not yet mature and some problems remain: for example, cache hit rates are low, cache writes are slow, and cached data cannot be reused once its index is lost.
On one hand, the problem of low cache hit rate has many contributing factors; for example, it is related to the data size and query patterns of user queries, the capacity of the cache itself, the cache eviction policy, and other configuration. In general it is mainly caused by hot data being evicted from the cache by cold data. Hot data is data that is read frequently by user queries; cold data is data that is read less frequently. Clearly, caching hot data yields a higher query hit rate. Although caches do have data-access-based eviction policies such as Least Recently Used (LRU), hot data is inevitably evicted from the cache when a large or continuous volume of cold-data queries is encountered, reducing the query hit rate.
On the other hand, the problem of slow cache writes is mainly related to the mechanism by which the data query engine reads data. When querying data, a scanning thread first submits the location information (such as the file path and address) of the data segment to be read to a read thread, and the read thread searches for the data segment in the cache; if it is not found, the data segment is fetched by accessing the database, and the read thread returns it to the scanning thread. Since there is more than one read thread, a write lock protects the cache when they need to write to it concurrently. The locking and unlocking performed by each read thread therefore brings extra time overhead, and the read threads need considerable time to write data into the cache's disk space. Together these factors cause query speed to be significantly slower on a cache miss, even slower than if the cache function were not enabled at all.
In yet another aspect, there is the problem that cached data cannot be reused. The cache includes disk space for storing cache data and memory space for storing the cache metadata that indexes the cache data on disk. Because memory is volatile, when power is lost, for example due to a device restart, the cache metadata in memory is lost, so the cache data in the disk space cannot be indexed or reused and can only be discarded and refilled. As the size of the cache grows, the amount of cache data that cannot be reused, and of the corresponding refill data, also grows, leaving the data query engine in an inefficient query state for a long time (for example, 1 to 2 days) after each restart.
In view of this, embodiments of the present disclosure provide a cache management method, apparatus, storage medium, and electronic device, in which a cache admission control module configured with cache admission rules related to hot data characteristics screens data before it can be written into the cache, solving the problem of a reduced cache hit rate. Illustratively, the cache hit rate can also be improved by configuring cache admission rules related to key data to match user requirements.
Illustratively, the cache-write work can further be taken over by a cache write thread asynchronous to the read thread, decoupling the read thread from the cache-write work; this reduces the waiting time of the read thread, shortens the time taken to return data to the scanning thread, and avoids the query-performance degradation seen on cache misses.
Illustratively, the index loss caused by a restart can be handled by persisting and recovering the cache metadata; using the recovered cache metadata, the cached data can still be indexed and reused after a restart, avoiding the period of inefficient queries after restart and improving query performance.
Exemplary method embodiments
Referring to fig. 3, a schematic flow chart of a cache management method according to an embodiment of the disclosure is shown. The cache management method is used for managing a cache module used by a data query engine, and the data query engine is used for querying data from a distributed database in a network. In some examples, the data query engine may be an OLAP query engine as mentioned above, i.e., a data warehouse query engine such as Impala, and the caching module may be, for example, the DataCache of Impala. The data query engine and the cache module can be deployed in a local server and access the distributed database through a network. In some examples, the distributed database may be, for example, HDFS. In other examples, the data query engine may also be an OLTP data query engine.
In fig. 3, the cache management method includes:
step S301: at least one cache admission rule is read based on the configuration file.
Wherein the at least one cache admission rule comprises a cache admission rule associated with a hot data feature.
In some embodiments, a configuration file is provided in step S301 to supply the cache admission control module with cache admission rules. The user can write the required cache admission rules into the configuration file in advance, and can thus set the required cache admission rules dynamically and flexibly. A cache admission rule associated with hot data characteristics allows matching data, which is likely to be hot data, to be written to the cache, thereby excluding cold data.
Step S302: and configuring the cache admission rule to a cache admission control module.
In some embodiments, the cache admission control module may store the read cache admission rules to implement the configuration.
Step S303: in response to the input data to be cached, the cache admission control module matches target data in the data to be cached based on a cache admission rule and allows the target data to be written into the cache module.
In some embodiments, the cache admission control module may match the data to be written into the cache against the stored cache admission rules; the data to be cached that matches successfully is the target data and is allowed to be written into the cache module.
By setting specified rules, data matching the hot data characteristics is allowed to be written into the cache module, while data not matching them (which may be cold data) is refused; this prevents hot data from being evicted from the cache by cold data and effectively improves the cache hit rate.
Various technical implementation details of the cache management method are described in the following through various embodiments.
In some embodiments, the configuration file serves as the user's tool for configuring the cache admission control module: the required cache admission rules are written into the configuration file in advance and stored, and when the cache admission control module operates, the file is loaded and the rules in it are read to control cache writes.
In some embodiments, the hot data characteristics are related to the frequency of data access, and the corresponding cache admission rules may be determined by combining the frequency of data access with certain feature dimensions of the database. Illustratively, the libraries, tables, data partitions, and time partitions of a database are organized in the HDFS in the form of a file directory tree, and the related information is contained in the file name (Filename) of the data. For example, the file name hdfs://test/abcd.db/mmmm/date=20xx-xx-xx/bbbb indicates that this data belongs to the mmmm table of the database abcd, that the table is partitioned by the date field date, and that the data is for the given day in year 20xx. On this basis, cache admission rules related to hot data characteristics can be determined by combining the dimensions of library, table, data partition, and time partition with dimensions such as data query frequency. For example, by observing that a user's queries on a certain table are mainly concentrated in a certain time partition (such as the most recent A to B days), a cache admission rule corresponding to that time partition can be set, specifying that the data of the most recent A to B days of that table is allowed to be cached.
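Since the rule dimensions are encoded in the HDFS path, a matcher first has to split the file name into database, table, and partition. The following C++ sketch shows one way to parse a path of the shape given above; the parsing rules are assumptions based on that single example, not the embodiment's actual logic.

```cpp
#include <string>

// Hypothetical parse of a path such as
//   hdfs://test/abcd.db/mmmm/date=20xx-xx-xx/bbbb
// into the dimensions an admission rule can match on.
struct PathInfo {
    std::string database;   // "abcd"  (from the ".db" directory)
    std::string table;      // "mmmm"
    std::string partition;  // "date=20xx-xx-xx"
};

bool ParseHdfsPath(const std::string& path, PathInfo* out) {
    size_t db_end = path.find(".db/");
    if (db_end == std::string::npos) return false;
    size_t db_start = path.rfind('/', db_end) + 1;
    out->database = path.substr(db_start, db_end - db_start);
    size_t table_start = db_end + 4;  // skip ".db/"
    size_t table_end = path.find('/', table_start);
    if (table_end == std::string::npos) return false;
    out->table = path.substr(table_start, table_end - table_start);
    size_t part_end = path.find('/', table_end + 1);
    if (part_end != std::string::npos)
        out->partition = path.substr(table_end + 1, part_end - table_end - 1);
    return true;
}
```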
As can be appreciated from the above description, in some embodiments, the type of cache admission rule associated with the hot data characteristic may include at least one of: cache admission rules on the library level; cache admission rules on a table level; cache admission rules for data partitions; cache admission rules with respect to time partitions, etc. The cache admission rule about the library level may be for one or more target databases, which indicates that data in the target databases are likely to be hot data, and data of the one or more target databases is allowed to be written into the cache module, and identification information (such as a database name and/or an ID, etc.) of the target databases may be included in the cache admission rule. Similarly, the cache admission rules at the table level may indicate that data in the target table is allowed to be written into the cache module by including identification information (e.g., table name) of one or more target tables. The cache admission rules for a data partition may include, for example, identification information of one or more target data partitions in a target table to indicate that data in the target data partitions in the target table are allowed to be written to the cache module. The cache admission rule for time-division of data may include time information of a target time-division to indicate that data in the time-division is allowed to be written into the cache module.
To illustrate the cache admission rules associated with hot data characteristics intuitively, example implementations of several of the above rule types are sketched below.
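The original specification presents these examples as a table that is not reproduced in this text; as a hedged illustration only, the sketch below shows what such rules might look like in a configuration file. The syntax, key names, and values are all hypothetical.

```text
# Hypothetical configuration sketch (names and syntax are illustrative,
# not the embodiment's actual format):
cache_admission_rules:
  - allow_db: abcd                          # library-level rule: whole database
  - allow_table: abcd.mmmm                  # table-level rule: one table
  - allow_partition: abcd.mmmm/city=hz      # data-partition rule
  - allow_recent: abcd.mmmm/date within 7d  # time-partition rule: last 7 days
```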
It should be noted that, in some embodiments, cache admission rules related to hot data characteristics may also be determined from other feature dimensions, for example cache admission rules based on geographical location, such as allowing data from devices in a specified area to be written to the cache module. The feature dimensions above are therefore merely exemplary, not limiting.
In some embodiments, cache admission rules corresponding to custom-specified data to be cached may also be set. Such data may be cached to meet specific user requirements or to accelerate an application that uses it. Illustratively, the types of cache admission rules for such data may further include at least one of the following: a cache admission rule regarding data of interest to a user; a cache admission rule for small-file-table data; a cache admission rule regarding cache metadata used to index cached data; and a cache admission rule for data of a materialized view table. Each rule is explained below.
In a possible example, the cache admission rule regarding data of interest to a user may be determined from the characteristics of the data important users care about. For example, the library/table/partition data behind the reports needed by high-level and important clients can be cached so that the corresponding queries are served from the cache.
In a possible example, the cache admission rule for small-file-table data may allow the data of small file tables whose data volume is below a preset absolute or relative threshold to be cached. This achieves a certain merging effect for small file tables in the local cache module, avoiding frequent reads of large numbers of small file tables from the database over the network.
In a possible example, consider the cache admission rule regarding cache metadata used to index cached data. For example, Apache Parquet is a column-oriented data storage format in the Apache Hadoop ecosystem. A Parquet file includes data and metadata; structurally, it consists of a header, one or more blocks, and a footer, where the footer stores the metadata of the Parquet file. Although a Parquet file can be read concurrently by multiple threads, the footer must be read first. Therefore, the whole Parquet file need not be cached: caching only the footer data containing the metadata already improves query performance, while requiring little cache space.
In a possible example, a cache admission rule for the data of a materialized view table can be set. Caching the data of the materialized view table accelerates operations on it, so the materialized view can deliver its full benefit.
For performance, in some embodiments the cache admission rules may be recorded using structures such as sets or hash tables, so that when data is matched the result can be produced in constant time, with little impact on the performance of the cache module. In some embodiments, to make it easy for the user to adjust the rules, the configuration file may be dynamically hot-loadable: the cache module periodically checks the configuration file for modifications and, if there are changes, re-reads it to obtain the updated cache admission rules.
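A minimal sketch of such constant-time matching with hash sets, plus a periodic hot-reload check, is shown below, reusing the PathInfo sketch above. The class name, members, and reload policy are assumptions for illustration, not the embodiment's interfaces.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_set>

class CacheAdmissionControl {
 public:
  // Constant-time admission test against the configured rule sets.
  bool Admit(const PathInfo& info) const {
    std::lock_guard<std::mutex> l(mu_);
    return allowed_dbs_.count(info.database) > 0 ||
           allowed_tables_.count(info.database + "." + info.table) > 0;
    // partition and time-partition rules would be checked analogously
  }

  // Called periodically; re-reads the file only when it has changed.
  void MaybeReload(const std::string& config_path) {
    int64_t mtime = FileMtime(config_path);
    if (mtime == last_mtime_) return;  // unchanged, nothing to do
    std::lock_guard<std::mutex> l(mu_);
    // ... parse config_path and rebuild the rule sets here ...
    last_mtime_ = mtime;
  }

 private:
  static int64_t FileMtime(const std::string&) { return 0; }  // stat() stub
  mutable std::mutex mu_;
  std::unordered_set<std::string> allowed_dbs_, allowed_tables_;
  int64_t last_mtime_ = -1;
};
```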
In some embodiments, the cache admission control module is not limited to holding cache admission rules; it may also be configured with, for example, control instructions, that is, it can serve as an input interface for control instructions. Accordingly, a user can write a control instruction into the configuration file, so that the instruction is obtained from the configuration file and configured to the cache admission control module to trigger the corresponding control action. For example, an instruction may be written into the configuration file to reset metrics of the cache module, such as the cache hit rate, so as to observe how the performance of the cache module changes after the cache admission rules are modified.
In this way, the cache hit rate is improved by screening the data written to the cache module through the cache admission control module.
Consider again the slow cache writes caused by the working mechanism of the read thread. The write speed of the storage medium is not determined by the cache module, and the lock is a necessary safety measure for concurrent multi-threaded writes, so it is very difficult to solve slow cache writes through those two aspects. The embodiments of the present disclosure therefore solve the problem indirectly. As described above, the read thread must complete the cache write before returning data to the scan thread. In the embodiments of the present disclosure, an asynchronous cache mechanism may be provided: the cache-write task previously handled by the read thread is transferred to other, asynchronously working threads, and the read thread can return data to the scan thread as soon as it has submitted the task. This effectively increases the return speed of the read thread and avoids the return delays caused by, for example, querying the network-side database on a miss or contention between read threads.
In addition, the problem of slow query caused by the working mode of the read thread can be relieved to a certain extent by realizing an asynchronous cache writing mechanism.
Fig. 4 is a schematic diagram illustrating a flow of asynchronous cache writing in the cache management method according to the embodiment of the disclosure.
Illustratively, the embodiment of fig. 4 may be combined with the embodiment of fig. 3, and the target data which is allowed to be written into the cache module is executed asynchronous cache writing after the cache admission rules of the cache admission module are matched.
In fig. 4, the process includes:
step S401: and responding to the target data to be written into the cache module, creating a cache task corresponding to the target data, and acquiring a buffer area.
In some embodiments, after the target data that the read thread needs to write to the cache is admitted by the cache admission control module, the target data may be encapsulated and a corresponding cache task (StoreTask) created. The cache task may include basic information for determining the location of the data, such as the file name (Filename) corresponding to the target data, and may also include a buffer (Buffer) requested from a buffer pool (BufferPool). Cache tasks may be placed in a cache task queue. The buffer pool can be allocated in memory for faster read and write speed.
In addition, a cache-write thread pool may be initialized in advance; it contains multiple cache write threads responsible for writing data into the cache according to cache tasks. Whenever a cache task exists in the task queue, a cache write thread takes it out and writes the data in the buffer into the cache according to the information carried by the task.
Step S402: and the reading thread reads the target data and copies the target data to the buffer area.
In some embodiments, the read thread may return only by copying data into the buffer, and subsequent cache write work is handed over to the queued cache write thread in the cache write thread pool.
Since the buffer may be located in memory, the read thread's copy operation is actually a memory copy operation, which is an order of magnitude faster than disk write operations.
Step S403: and the cache writing thread reads the target data from the buffer area based on the cache task and writes the target data into the cache module.
In some embodiments, a resource reclamation mechanism may also be provided. For example, after a cache write thread finishes a cache task, the cache task may be destroyed, and the buffer it used may be released and returned to the buffer pool for use by subsequent cache tasks.
It can be understood that, under this asynchronous cache mechanism, when a cache miss occurs the work of the read thread changes from a complete cache write to a simple memory copy, which significantly increases the speed of returning data to the scan thread and solves, from an overall perspective, the problem caused by slow cache writes.
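The asynchronous mechanism is essentially a producer/consumer queue. The following C++ sketch, reusing the CacheKey and SimpleCache types assumed earlier, shows the read thread doing only a memory copy plus an enqueue while pooled cache write threads drain the queue; shutdown handling and the real BufferPool are omitted, so this is an illustration rather than the embodiment's code.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct StoreTask {
    CacheKey key;
    std::vector<char> buffer;  // stands in for a BufferPool allocation
};

class AsyncCacheWriter {
 public:
  AsyncCacheWriter(SimpleCache* cache, int threads) : cache_(cache) {
    for (int i = 0; i < threads; ++i)
      workers_.emplace_back([this] { Run(); });
  }

  // Read-thread side: a memory copy plus an enqueue, then return at once.
  void Submit(const CacheKey& key, const char* data, int64_t len) {
    StoreTask t{key, std::vector<char>(data, data + len)};
    {
      std::lock_guard<std::mutex> l(mu_);
      queue_.push(std::move(t));
    }
    cv_.notify_one();
  }

 private:
  // Cache-write thread: performs the slow disk write asynchronously.
  // NOTE: no shutdown path; a real pool would join its workers.
  void Run() {
    for (;;) {
      std::unique_lock<std::mutex> l(mu_);
      cv_.wait(l, [this] { return !queue_.empty(); });
      StoreTask t = std::move(queue_.front());
      queue_.pop();
      l.unlock();
      cache_->Store(t.key, t.buffer.data(), (int64_t)t.buffer.size());
    }  // the buffer is released here, i.e. returned to the pool
  }

  SimpleCache* cache_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<StoreTask> queue_;
  std::vector<std::thread> workers_;
};
```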
Fig. 5 is a schematic diagram illustrating a cache write flow in an application example of the present disclosure. The application of a cache admission control module and an asynchronous caching mechanism is shown, which can be compared with fig. 2B to show the differences.
The cache admission control module obtains the cache admission rules from the configuration file and stores them. When the reading module writes data into the cache module, the cache admission control module judges, according to the cache admission rules, whether the data is target data conforming to a rule. If not, the write is refused and the call returns; if so, the target data is allowed to be written into the cache module. Accordingly, a cache task may be created and a buffer of length buffer_len requested from the buffer pool. The target data conforming to the cache admission rule is then written into the buffer by the read thread, and written into the cache file on disk by an idle cache write thread.
Thereafter, the cache metadata may be written, with the cache key and associated cache entry as in fig. 2B. In addition, the caching module executes an eviction mechanism for the cached data.
As described above, the cache module includes a memory area and a disk area: the disk area stores cache data, and the memory area stores cache metadata used to index the cache data in the disk area, as with the cache keys and cache entries in the previous embodiments. The memory may be volatile RAM, so after the local device restarts, the cache metadata in the memory area is lost with the power, the cache data on disk can no longer be indexed, and it therefore cannot be reused after the restart.
Therefore, in order to solve the problem that the cache data cannot be reused, the cache management method in the embodiment of the present disclosure may further include a cache persistence process.
Fig. 6A is a schematic flow chart showing a cache persistence process according to an embodiment of the present disclosure.
In fig. 6A, the cache persistence procedure includes:
step S601: and responding to a persistence instruction, and setting the cache module to be read-only.
In some embodiments, the persistence instruction may result from the data query engine stopping or exiting; alternatively, it may be actively input. In a possible example, the persistence instruction may be input to the caching module by writing it into the configuration file, which the caching module reads. After receiving the persistence instruction, the cache module is triggered to execute the cache persistence process.
In step S601, to fix the current cache data so that it does not change, the cache must be set to read-only. In a specific implementation, read-only can be realized by disabling the write function of the cache module, i.e., stopping Store (StopStore).
Step S602: creating an intermediary object and storing the cache metadata and the description object of the cache data in the intermediary object.
In some embodiments, in addition to the cache metadata, the description objects (CacheFiles) of the cache files also need to be persisted; they record information such as the locations of the cache files, through which the cache module accesses them. A description object of a cache file may specifically include the file name, file length, and modification time of the cache file.
Step S603: and the medium object is subjected to persistence processing to form persistent data and is stored in the disk area.
In some embodiments, the intermediary object is a transition between the data to be persisted and the disk file: the data to be persisted only needs to be stored in the Dumper, which is then serialized (Serialization) directly to disk, completing the persistence.
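As an illustration of the persistence step, the following C++ sketch collects the description objects and the cache metadata into a Dumper and serializes it to a dump file with a naive line-based format. The real serialization format is not disclosed in the text, so the format, names, and error handling here are assumptions.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Description object of a cache file: name, length, modification time.
struct CacheFileDesc {
    std::string name;
    int64_t length;
    int64_t mtime;
};

// The intermediary object (Dumper): holds everything to be persisted.
struct Dumper {
    std::vector<CacheFileDesc> cache_files;
    MetaCache meta;  // reuses the MetaCache sketch defined earlier
};

// Naive serialization; breaks on file names containing spaces, which a
// real format would handle. Returns false on any write failure.
bool Persist(const Dumper& d, const std::string& dump_path) {
    std::ofstream out(dump_path);  // the persistent file (Dump File)
    if (!out) return false;
    for (const auto& f : d.cache_files)
        out << "F " << f.name << ' ' << f.length << ' ' << f.mtime << '\n';
    for (const auto& [k, e] : d.meta)
        out << "E " << k.filename << ' ' << k.mtime << ' ' << k.offset << ' '
            << e.file_id << ' ' << e.offset << ' ' << e.len << '\n';
    return bool(out);
}
```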
Fig. 6B is a schematic diagram showing the flow of cache persistence and recovery in an embodiment of the present disclosure. Reuse of the cached data is achieved by persisting and recovering the cache metadata and the cache-file description objects.
In fig. 6B, the process includes:
step S601: and responding to a persistence instruction, and setting the cache module to be read-only.
Step S602: creating an intermediary object and storing the cache metadata and the description object of the cache data in the intermediary object.
Step S603: the medium object is subjected to persistence processing to form persistent data, and the persistent data is stored in the disk area;
step S604: the persistent data is retrieved from the disk region in response to a startup of the data query engine.
In some embodiments, after the local device restarts and the data query engine restarts with it, this triggers the recovery of the persistent data, so that the cache files on the corresponding disk can be indexed again and reused by the data query engine.
Step S605: and the medium object is obtained by the persistence of the persistence data.
Step S606: and restoring the description objects of the cache metadata and the cache data based on the intermediate object.
Steps S605 and S606 are the reverse of steps S602 and S603. To ensure data reliability, in some embodiments, if an error occurs at any point in the cache recovery process, for example an error reading or checking the persistent data, a de-persistence error, or an error in the recovered cache metadata and/or description objects of the cache data, the data is considered unreliable: the original cache module is abandoned and a new replacement cache module is created.
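The recovery direction can be sketched as the inverse of Persist() above, followed by a consistency check that abandons recovery on any mismatch. The dump format and the check against on-disk file size and modification time are again assumptions for illustration; the code uses POSIX stat().

```cpp
#include <fstream>
#include <sys/stat.h>

// Deserialize the Dumper, then verify each recovered cache-file
// description against the actual file on disk; any mismatch returns
// false so the caller can discard the old cache and create a new one.
bool Recover(const std::string& dump_path, Dumper* d) {
    std::ifstream in(dump_path);
    if (!in) return false;             // no persistent data available
    char tag;
    while (in >> tag) {                // inverse of Persist() above
        if (tag == 'F') {
            CacheFileDesc f;
            in >> f.name >> f.length >> f.mtime;
            d->cache_files.push_back(f);
        } else if (tag == 'E') {
            CacheKey k; CacheEntry e;
            in >> k.filename >> k.mtime >> k.offset
               >> e.file_id >> e.offset >> e.len;
            d->meta[k] = e;
        } else {
            return false;              // corrupt dump: give up recovery
        }
    }
    for (const auto& f : d->cache_files) {  // consistency check
        struct stat st;
        if (stat(f.name.c_str(), &st) != 0 || st.st_size != f.length ||
            (int64_t)st.st_mtime != f.mtime)
            return false;              // unreliable: recreate the cache
    }
    return true;
}
```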
Referring also to FIG. 7, a flowchart illustrating persistence and recovery in an exemplary application of the present disclosure is shown.
As shown in fig. 7, the left side of the diagram shows the cache persistence flow: after receiving a persistence instruction, the cache module is set to read-only (StopStore) and an intermediary object Dumper is created; the description objects CacheFiles of the cache data and the cache metadata MetaCache are placed into the intermediary object, which is then serialized to obtain a persistent file (Dump File) stored on disk, completing the cache persistence flow.
The right side of the figure shows the cache recovery flow: when the data query engine starts, the persistent file is read from disk, deserialization yields the Dumper, CacheFiles and MetaCache are read from the Dumper, a consistency check (Consistency Check) may be performed, and if the check passes, recovery is complete.
By persisting the index of the cached data before the data query engine restarts, and recovering the persisted data after the restart to re-form the index, the cached data can be reused. This solves the problem in the related art that cached data cannot be reused after a restart, eliminates the long period of inefficient queries after each restart, and improves query efficiency and user experience.
It should be noted that the problems listed above, namely the low cache hit rate, the slow data return of the read thread on a cache miss, and the inability to reuse cached data, are correlated, so optimizing one of them also benefits the others. The optimizations listed above, such as improving the cache hit rate and cache persistence, can therefore be used together, achieving a better effect than any single one of them alone.
Exemplary apparatus embodiments
Having described exemplary method embodiments of the present disclosure, a cache management apparatus 800 of an exemplary embodiment of the present disclosure is described next with reference to fig. 8.
Since each functional module or sub-module of the cache management device 800 according to the embodiment of the present disclosure has the same principle as the corresponding step or sub-step of the cache management method in the foregoing exemplary method embodiment, the specific implementation in this embodiment may refer to the corresponding content in the previous cache management method, and therefore, the same technical content is not repeated herein.
Referring to fig. 8, an exemplary embodiment of the present disclosure provides a cache management apparatus 800 for managing a cache module used by a data query engine for querying a distributed database in a network for data; the cache management apparatus 800 includes: a rule reading module 801, configured to read at least one cache admission rule based on the configuration file; wherein the at least one cache admission rule comprises a cache admission rule associated with a hot data feature; a first configuration module 802, configured to configure the cache admission rule to a cache admission control module 803; the cache admission control module 803 is configured to, in response to input data to be cached, match target data in the data to be cached based on a cache admission rule, and allow the target data to be written into the cache module.
In some embodiments, the configuration file is further used for writing control instructions corresponding to the cache module. The cache management apparatus 800 then includes: an instruction reading module for acquiring the control instruction based on the configuration file; and a second configuration module for configuring the instruction to the cache admission control module 803 to trigger the corresponding control action.
In some embodiments, the control instructions comprise: instructions for resetting a hit rate indicator of the cache.
In some embodiments, the cache admission rules include at least one of: cache admission rules on the library level; cache admission rules on a table level; cache admission rules for data partitions; cache admission rules regarding data partitioning by time.
In some embodiments, the cache admission rule further comprises: the cache admission rules about the data to be cached comprise at least one of the following: a cache admission rule regarding data of interest to a user; cache admission rules for small file table data; a cache admission rule regarding cache metadata used to index the cache data; a cache admission rule for data of the materialized view table.
In some embodiments, the cache management apparatus 800 includes: the task creating module is used for responding to the target data to be written into the cache module, creating a cache task corresponding to the target data and acquiring a buffer area; the reading thread is used for reading the target data and copying the target data to the buffer area; and the cache write thread is used for reading the target data from the buffer area based on the cache task and writing the target data into the cache module.
In some embodiments, the cache management apparatus 800 further includes: and the buffer area releasing module is used for responding to the completion of the cache task and releasing the corresponding buffer area.
In some embodiments, the cache module includes a memory region and a disk region; the disk area is used for storing cache data; the memory area is used for storing cache metadata, and the cache metadata is used for indexing cache data in the disk area; the cache management apparatus 800 includes a cache persistence module, which includes: the cache setting module is used for responding to the persistence instruction and setting the cache module to be read-only; the intermediary object processing module is used for creating an intermediary object and storing the cache metadata and the description object of the cache data into the intermediary object; and the persistence processing module is used for persistence processing the medium object to form persistence data and storing the persistence data in the disk area.
In some embodiments, the cache management apparatus 800 further includes a cache recovery module, which includes: a persistent data acquisition module for acquiring the persistent data from the disk area in response to the data query engine starting; a de-persistence processing module for de-persisting the persistent data to obtain the intermediary object; and a recovery sub-module for restoring the cache metadata and the description objects of the cache data based on the intermediary object.
In some embodiments, the cache management apparatus 800 includes: an error processing module for abandoning recovery and creating a new replacement cache module in response to an error occurring during recovery by the cache recovery module.
In some embodiments, the persistence instructions are stored in the configuration file; the cache management apparatus 800 includes: the instruction reading module is used for reading the persistence instruction based on the configuration file; a second configuration module, configured to configure the persistent instruction to the cache admission control module 803, and configured to trigger the cache persistent procedure.
Exemplary storage Medium
Having described the method and apparatus of the exemplary embodiments of the present disclosure, a storage medium of the exemplary embodiments of the present disclosure will be described with reference to fig. 9.
Referring to fig. 9, a storage medium 900 according to an embodiment of the disclosure is described, which may contain program code and may be run on a device, such as a computer or a mobile terminal, to implement the steps and sub-steps of the cache management method in the above embodiments of the disclosure. In the context of this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
Exemplary Electronic Device
Having described the storage medium of the exemplary embodiment of the present disclosure, next, an electronic device of the exemplary embodiment of the present disclosure will be described with reference to fig. 10.
The electronic device 1000 shown in fig. 10 is only an example and should not impose any limitation on the functionality or scope of use of the embodiments of the present disclosure. The electronic device 1000 may be implemented as a server or the like, serving as the local carrier for the data query engine and the cache module.
As shown in fig. 10, the electronic device 1000 is embodied in the form of a general purpose computing device. The components of the electronic device 1000 may include, but are not limited to: the at least one processing unit 1010, the at least one memory unit 1020, and a bus 1030 that couples various system components including the memory unit 1020 and the processing unit 1010.
The storage unit stores program code that can be executed by the processing unit 1010, so that the processing unit 1010 performs the steps and sub-steps of the cache management method described in the above embodiments of the present disclosure. For example, the processing unit 1010 may perform the steps shown in fig. 3, 4, 6A, 6B, and so on.
In some embodiments, the memory unit 1020 may include volatile memory units, such as a random access memory unit (RAM) 10201 and/or a cache memory unit 10202, and may further include a read-only memory unit (ROM) 10203.
In some embodiments, the memory unit 1020 may also include a program/utility 10204 having a set (at least one) of program modules 10205, such program modules 10205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment.
In some embodiments, bus 1030 may include a data bus, an address bus, and a control bus.
In some embodiments, the electronic device 1000 may also communicate with one or more external devices 1100 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.) through an input/output (I/O) interface 1050. Optionally, the electronic device 1000 further comprises a display unit 1040 connected to the input/output (I/O) interface 1050 for display. Also, the electronic device 1000 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 1060. As shown, the network adapter 1060 communicates with the other modules of the electronic device 1000 over the bus 1030. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
It should be noted that although several modules or sub-modules of the cache management device are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more of the units/modules described above may be embodied in a single unit/module. Conversely, the features and functions of one unit/module described above may be further divided into and embodied by a plurality of units/modules.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed. The division into aspects is for convenience of presentation only and does not mean that features in those aspects cannot be combined to benefit. The present disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A cache management method for managing a cache module used by a data query engine, wherein the data query engine is configured to query data from a distributed database in a network, the method comprising:
reading at least one cache admission rule based on a configuration file, wherein the at least one cache admission rule comprises a cache admission rule associated with a hot-data feature;
configuring the cache admission rule to a cache admission control module;
and in response to input of data to be cached, matching, by the cache admission control module, target data in the data to be cached based on the cache admission rule, and allowing the target data to be written into the cache module.
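Read as an editor's illustration rather than the claimed implementation, the admission step of claim 1 amounts to filtering the input through configured predicates. The types below (DataItem, admission rules as Predicate objects) are invented for this sketch.

import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// A hypothetical unit of data to be cached, tagged with its origin.
record DataItem(String database, String table, String partition) {}

class CacheAdmissionControl {
    private final List<Predicate<DataItem>> rules;  // configured cache admission rules

    CacheAdmissionControl(List<Predicate<DataItem>> rules) {
        this.rules = rules;
    }

    // Match target data in the data to be cached: an item is admitted to the
    // cache module if at least one admission rule accepts it.
    List<DataItem> admit(List<DataItem> toBeCached) {
        return toBeCached.stream()
                .filter(item -> rules.stream().anyMatch(rule -> rule.test(item)))
                .collect(Collectors.toList());
    }
}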
2. The cache management method according to claim 1, wherein a control instruction corresponding to the cache module is further written into the configuration file, the method further comprising:
acquiring the control instruction based on the configuration file;
and configuring the control instruction to the cache admission control module to trigger a corresponding control action.
3. The cache management method according to claim 1, wherein the cache admission rule comprises at least one of: a cache admission rule at the database level; a cache admission rule at the table level; a cache admission rule for data partitions; and a cache admission rule for data partitions divided by time.
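Under the same invented types as the previous sketch, the four granularities of claim 3 might look as follows; the rule values and the dt= partition-naming convention are assumptions for illustration only.

import java.util.List;
import java.util.function.Predicate;

class AdmissionRuleExamples {
    static List<Predicate<DataItem>> rules() {
        Predicate<DataItem> databaseLevel = d -> d.database().equals("sales_db");
        Predicate<DataItem> tableLevel    = d -> d.table().equals("orders");
        Predicate<DataItem> partitionRule = d -> d.partition().equals("region=cn");
        // Time-based partitioning: admit only recent daily partitions such as "dt=2022-02-...".
        Predicate<DataItem> timePartition = d -> d.partition().startsWith("dt=2022-02");
        return List.of(databaseLevel, tableLevel, partitionRule, timePartition);
    }
}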
4. The cache management method according to claim 1, further comprising:
in response to the target data being ready to be written into the cache module, creating a cache task corresponding to the target data and acquiring a buffer area;
reading, by a read thread, the target data and copying the target data to the buffer area;
and reading, by a cache write thread, the target data from the buffer area based on the cache task and writing the target data into the cache module.
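The two-thread hand-off of claim 4 resembles a producer-consumer pattern over a bounded buffer. The sketch below is illustrative only; the queue capacity and the in-memory stand-ins for the target data and the cache module are assumptions of the example.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BufferedCacheWriter {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1024);  // the acquired buffer area
        String[] targetData = {"row-1", "row-2", "row-3"};

        // Read thread: reads the target data and copies it into the buffer area.
        Thread readThread = new Thread(() -> {
            for (String row : targetData) {
                try {
                    buffer.put(row);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        // Cache write thread: takes the target data from the buffer area and
        // writes it into the cache module (modeled here as printing).
        Thread cacheWriteThread = new Thread(() -> {
            for (int i = 0; i < targetData.length; i++) {
                try {
                    System.out.println("write to cache module: " + buffer.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        readThread.start();
        cacheWriteThread.start();
        readThread.join();
        cacheWriteThread.join();
    }
}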
5. The cache management method according to claim 1, wherein the cache module includes a memory area and a disk area; the disk area is used for storing cache data; the memory area is used for storing cache metadata, and the cache metadata is used for indexing cache data in the disk area; the method comprises a cache persistence process, which comprises:
setting the cache module to be read-only in response to a persistence instruction;
creating an intermediary object, and storing the cache metadata and the description objects of the cache data into the intermediary object;
and persisting the intermediary object to form persistent data, and storing the persistent data in the disk area.
6. The cache management method according to claim 5, further comprising a cache recovery process, which comprises:
in response to startup of the data query engine, acquiring the persistent data from the disk area;
de-persisting the persistent data to obtain the intermediary object;
and restoring the cache metadata and the description objects of the cache data based on the intermediary object.
7. The cache management method according to claim 5, wherein the persistence instruction is stored in the configuration file, the method further comprising:
reading the persistence instruction based on the configuration file;
and configuring the persistence instruction to the cache admission control module to trigger the cache persistence process.
8. A cache management device for managing a cache module used by a data query engine, wherein the data query engine is configured to query data from a distributed database in a network, the device comprising:
a rule reading module, configured to read at least one cache admission rule based on a configuration file, wherein the at least one cache admission rule comprises a cache admission rule associated with a hot-data feature;
a first configuration module, configured to configure the cache admission rule to a cache admission control module;
and the cache admission control module, configured to match, in response to input of data to be cached, target data in the data to be cached based on the cache admission rule, and to allow the target data to be written into the cache module.
9. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements:
a cache management method according to any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform, via execution of the executable instructions:
a cache management method according to any one of claims 1 to 7.
CN202210122494.5A 2022-02-09 2022-02-09 Cache management method and device, storage medium and electronic equipment Pending CN114443722A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210122494.5A CN114443722A (en) 2022-02-09 2022-02-09 Cache management method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210122494.5A CN114443722A (en) 2022-02-09 2022-02-09 Cache management method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114443722A 2022-05-06

Family

ID=81371161

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210122494.5A Pending CN114443722A (en) 2022-02-09 2022-02-09 Cache management method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114443722A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116467353A (en) * 2023-06-12 2023-07-21 天翼云科技有限公司 Self-adaptive adjustment caching method and system based on LRU differentiation
CN116467353B (en) * 2023-06-12 2023-10-10 天翼云科技有限公司 Self-adaptive adjustment caching method and system based on LRU differentiation


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231115

Address after: 310052 Room 301, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou NetEase Shuzhifan Technology Co.,Ltd.

Address before: 310052 Floors 4 and 7, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: NETEASE (HANGZHOU) NETWORK Co.,Ltd.