CN113177031A - Processing method and device for database shared cache, electronic equipment and medium - Google Patents

Processing method and device for database shared cache, electronic equipment and medium

Info

Publication number
CN113177031A
Authority
CN
China
Prior art keywords
shared cache
data page
time
storage capacity
capacity value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110428707.2A
Other languages
Chinese (zh)
Other versions
CN113177031B (en
Inventor
范国腾
尹强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingbase Information Technologies Co Ltd
Original Assignee
Beijing Kingbase Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingbase Information Technologies Co Ltd filed Critical Beijing Kingbase Information Technologies Co Ltd
Priority to CN202110428707.2A priority Critical patent/CN113177031B/en
Publication of CN113177031A publication Critical patent/CN113177031A/en
Application granted granted Critical
Publication of CN113177031B publication Critical patent/CN113177031B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure relates to a processing method, apparatus, electronic device and medium for a database shared cache; wherein, the method comprises the following steps: receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time; acquiring a data page identification sequence of a shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by a database process accessing a shared cache region; and determining the storage capacity value of the shared cache region according to the data page identification sequence. The embodiment of the disclosure solves the problem that the stability of the determined storage capacity value is low due to the dependence on manual experience, so that the storage capacity value of the shared cache region is accurately estimated.

Description

Processing method and device for database shared cache, electronic equipment and medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing a shared cache of a database, an electronic device, and a medium.
Background
A database is fundamental application software. With the wide adoption of new applications such as the internet, big data and artificial intelligence in commercial use, database applications have become increasingly large and complex, and database performance has become increasingly important. In current mainstream database products, shared data is accessed among multiple process instances through a shared in-memory data buffer, so the access efficiency of the shared memory is one of the important factors affecting database performance, and the storage capacity value of the shared memory directly affects that access efficiency. At present, the storage capacity of the shared data buffer is preconfigured based on human experience.
The existing scheme has the following drawback: it depends on the subjective judgment of configuration personnel and places high demands on their technical skill, so the stability of the storage capacity configuration of the shared cache region is low.
Disclosure of Invention
To solve the technical problem or at least partially solve the technical problem, the present disclosure provides a method, an apparatus, an electronic device, and a medium for processing a database shared cache.
In a first aspect, the present disclosure provides a method for processing a shared cache of a database, including:
receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time;
acquiring a data page identification sequence of a shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by a database process accessing the shared cache region;
and determining the storage capacity value of the shared cache region according to the data page identification sequence.
Optionally, the acquiring a data page identifier sequence of a shared cache area based on the recommended start time and the recommended end time includes:
judging whether the current acquisition time exceeds the recommended termination time or not;
and if so, acquiring a data page identification sequence from the suggested start time to the suggested end time from the shared cache region.
Optionally, after determining whether the current acquisition time exceeds the recommended termination time, the method further includes:
if not, acquiring a first identification sequence from the suggested starting time to the current acquisition time from a shared cache region; and obtaining the data page identification sequence of the shared cache region until the current acquisition time is the suggested termination time.
Optionally, the determining, according to the data page identification sequence, a storage capacity value of the shared cache region includes:
counting the number of data page identifications with non-repeated identifications in the data page identification sequence;
and taking the number of the data page identifications with non-repeated identifications as the storage capacity value of the shared cache region.
Optionally, after determining the storage capacity value of the shared cache region, the method further includes:
obtaining the times of the data page sequence hit by the shared cache in the data page identification sequence; acquiring the times of the data page sequence hit by the disk cache in the data page identification sequence;
determining the disk access time of the shared cache region under the storage capacity value;
and determining a target cache information table of the shared cache region according to the number of times of hit of the data page sequence by the shared cache, the number of times of hit of the data page sequence by the disk cache, the disk access time under the storage capacity value and the storage capacity value.
Optionally, before determining the disk access time of the shared cache under the storage capacity value, the method further includes:
acquiring actual access times of a disk cache and access time of the actual access times;
the determining the disk access time of the shared cache region under the storage capacity value includes:
and determining the disk access time of the shared cache region under the storage capacity value according to the actual access times, the access time of the actual access times and the times of the data page sequence hit by the disk cache.
Optionally, the method further includes:
and returning the storage capacity value of the shared cache region for carrying out selective configuration on the storage capacity of the shared cache region.
In a second aspect, the present disclosure further provides a processing apparatus for a database shared cache, including:
the request receiving module is used for receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time;
the sequence acquisition module is used for acquiring a data page identification sequence of the shared cache region based on the suggestion starting time and the suggestion ending time; the data page identification sequence is a page identification set recorded by a database process accessing the shared cache region;
and the storage capacity value determining module is used for determining the storage capacity value of the shared cache region according to the data page identification sequence.
Optionally, the sequence acquisition module includes: the device comprises a time judging unit and a sequence acquisition unit;
the time judging unit is used for judging whether the current acquisition time exceeds the recommended termination time or not;
and if so, acquiring a data page identification sequence from the suggested start time to the suggested end time from the shared buffer area.
Optionally, the sequence acquiring unit is further configured to acquire, if not, a first identifier sequence from the recommended start time to the current acquisition time from the shared cache region; and obtaining the data page identification sequence of the shared cache region until the current acquisition time is the suggested termination time.
Optionally, the storage capacity value determining module is specifically configured to:
counting the number of data page identifications with non-repeated identifications in the data page identification sequence;
and taking the number of the data page identifications with non-repeated identifications as the storage capacity value of the shared cache region.
Optionally, the method further includes: the device comprises a frequency acquisition module, a time determination module and an information table determination module;
the number obtaining module is used for obtaining the number of times of hit of the data page sequence by the shared cache in the data page identification sequence; acquiring the times of the data page sequence hit by the disk cache in the data page identification sequence;
the time determining module is used for determining the disk access time of the shared cache region under the storage capacity value;
and the information table determining module is used for determining a target cache information table of the shared cache region according to the number of times of hit of the data page sequence by the shared cache, the number of times of hit of the data page sequence by the disk cache, the disk access time under the storage capacity value and the storage capacity value.
Optionally, the method further includes:
the time acquisition module is used for acquiring the actual access times of the disk cache and the access time of the actual access times;
a time determination module specifically configured to:
and determining the disk access time of the shared cache region under the storage capacity value according to the actual access times, the access time of the actual access times and the times of the data page sequence hit by the disk cache.
Optionally, the method further includes:
and the storage capacity value returning module is used for returning the storage capacity value of the shared cache region and is used for selectively configuring the storage capacity of the shared cache region.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages: the storage capacity value can be determined according to the historical process data of the shared cache region, the problem that the stability of the determined storage capacity value is low due to the fact that manual experience is relied on is solved, and therefore the storage capacity value of the shared cache region is accurately estimated.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments or technical solutions in the prior art of the present disclosure, the drawings used in the description of the embodiments or prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a processing method for a shared cache of a database according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart illustrating another processing method for a shared cache of a database according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of another processing method for a shared cache of a database according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of element permutations in an array;
FIG. 5 is a schematic diagram of permutation during data access;
fig. 6 is a schematic structural diagram of a processing apparatus for a database shared cache according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
Fig. 1 is a schematic flowchart of a processing method for a shared cache of a database according to an embodiment of the present disclosure. The embodiment is applicable to the situation of estimating the storage capacity value of the shared cache region. The method of this embodiment may be performed by a processing apparatus for a database shared cache, which may be implemented in hardware and/or software, may be configured in an electronic device, and can implement the processing method for the database shared cache in any embodiment of the present application.
The shared memory cache region is a memory structure that can be accessed by multiple database instance processes and is mainly used to cache data and reduce the Input/Output (IO) overhead of the database. Because memory access is far faster than access to hardware devices such as disks, most database products improve data access efficiency through a secondary cache mode: when data is read, the data block is first read from the shared data buffer in shared memory; if the data block is not cached in the shared memory, it is read from disk and simultaneously stored in the shared memory for the next read; if the data block already exists in the shared memory, it is read directly from memory, i.e., the shared memory is hit. If there is no free space in the shared memory, the data in some memory blocks is released by an algorithm so that the space can be reused, which is called shared memory replacement.
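The read path just described can be summarized in a short sketch. This is a minimal illustration under assumed names (SharedBuffer, read_page, read_from_disk are not from the original), not the database's actual buffer manager.

```python
class SharedBuffer:
    """Toy model of the secondary cache: shared memory in front of the disk."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages      # maximum number of cached data pages
        self.pages = {}                     # page identifier -> page data
        self.lru = []                       # page identifiers, least recently used first

    def read_page(self, page_id, read_from_disk):
        if page_id in self.pages:           # data block already in shared memory: hit
            self.lru.remove(page_id)
            self.lru.append(page_id)
            return self.pages[page_id]
        data = read_from_disk(page_id)      # miss: read the block from disk
        if len(self.pages) >= self.capacity:
            victim = self.lru.pop(0)        # shared memory replacement: release a block
            del self.pages[victim]
        self.pages[page_id] = data          # keep it in shared memory for the next read
        self.lru.append(page_id)
        return data
```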
The higher the hit rate of the shared memory, the better the performance of the database. The size of the allocated shared memory directly affects the hit rate: the hit rate increases as the buffer area grows, but the rate of improvement gradually declines, so the benefit of enlarging the buffer area gradually decreases. Meanwhile, for the same database system, the required shared memory size differs across application scenarios and time periods; blindly allocating an excessively large shared memory occupies too many memory hardware resources and results in waste.
As shown in fig. 1, the method specifically includes the following steps:
s110, receiving a shared cache suggestion request of a target database; wherein the shared cache proposal request includes a proposal start time and a proposal end time.
In this embodiment, a user may initiate the shared cache suggestion request by executing a Structured Query Language (SQL) statement for the shared cache suggestion at a client; the client may indicate the end of the shared cache suggestion request of the target database by executing the SQL statement again. The suggested start time may be the execution time at which the user executes the SQL statement for the first time through the client; the suggested termination time may be the execution time at which the user executes the SQL statement for the second time through the client.
Specifically, the SQL statement may be as follows:
select * from sba.share_buffer_snapshot
It should be noted that, in this embodiment, the shared cache suggestion request for the target database is not limited to being implemented with this SQL statement; other implementable statements may also be used, which is not specifically limited herein.
S120, acquiring a data page identification sequence of the shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by the database process accessing the shared cache region.
In this embodiment, each database process in the target database accesses data in the shared cache region; during access, the shared cache region records access information for each database process, such as the identifier sequence of accessed data pages, the type of accessed data, and the access time.
The unit of cached data in the shared cache region is a data page of a data file. For example, if each data page of the data file is 8K in size and the configured shared memory size is 128M, then 128M/8K data pages can be cached in the shared memory. Each page takes 'tablespace + database + tablefile + tabletype + tableblock' as its accessed data page identifier; specifically, the tablespace, database, table file, table type, and table block are all integer numbers, and the data page identifier formed from them may be a string of characters.
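As a concrete illustration of the two points above, the following sketch composes a page identifier string from the five integer fields and computes the 128M/8K page count; the helper name make_page_id and the underscore-joined format are assumptions for illustration only.

```python
def make_page_id(tablespace, database, table_file, table_type, table_block):
    # 'tablespace + database + tablefile + tabletype + tableblock': five integers
    # joined into one character string that identifies an accessed data page.
    return f"{tablespace}_{database}_{table_file}_{table_type}_{table_block}"

PAGE_SIZE = 8 * 1024                            # each data page is 8K
SHARED_MEMORY = 128 * 1024 * 1024               # configured shared memory of 128M
print(SHARED_MEMORY // PAGE_SIZE)               # 16384 data pages can be cached
print(make_page_id(1663, 16384, 24576, 0, 7))   # e.g. "1663_16384_24576_0_7"
```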
And S130, determining the storage capacity value of the shared cache region according to the data page identification sequence.
In this embodiment, the data page identifier sequence includes data page identifiers accessed by a plurality of database processes, including repeated data page identifiers and non-repeated data page identifiers; specifically, the storage capacity value of the shared cache region can be determined according to the number of the non-repeated data page identifications.
The method comprises the steps of receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time; acquiring a data page identification sequence of a shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by a database process accessing a shared cache region; and determining the storage capacity value of the shared cache region according to the data page identification sequence. The embodiment of the disclosure can determine the storage capacity value according to the historical process data of the shared cache region, and solves the problem that the stability of the determined storage capacity value is low due to the dependence on manual experience, so that the storage capacity value of the shared cache region can be accurately estimated.
Fig. 2 is a schematic flowchart of another processing method for a shared cache of a database according to an embodiment of the present disclosure. This embodiment further expands and optimizes the foregoing embodiment and can be combined with any of the optional alternatives in the above technical solutions. As shown in fig. 2, the method includes:
s210, receiving a shared cache suggestion request of a target database; wherein the shared cache proposal request includes a proposal start time and a proposal end time.
S220, judging whether the current acquisition time exceeds the recommended termination time; if yes, go to S230; if not, go to S240.
And S230, collecting a data page identification sequence from the suggested start time to the suggested end time from the shared buffer area.
In this embodiment, since each database process has a high access frequency to the data stored in the shared cache region, in order to improve the access efficiency of the cache data, this embodiment uses a secondary cache mode to store the recorded shared memory access sequence (i.e., the data page identification sequence).
When data collection is started, the process receiving the collection request in the target database stores an initial snapshot of the data page identifiers in a first disk file (such as a custom file sba_init). During collection, each database process records shared memory access requests in process-local memory, and when the local memory reaches a threshold or a timer in the database process expires (such as a set time), the shared memory access requests are flushed into a second disk file (such as a custom file sba_track), thereby implementing a second-level cache of the data.
Take starting the data acquisition process of the shared memory as an example: check whether an sba_init file currently exists; if not, it is determined that the user is executing the shared cache suggestion request for the first time. Apply for all partition locks in the shared memory and lock the shared memory, to prevent incomplete data acquisition caused by other processes accessing data during acquisition. Traverse the description array of the shared memory (such as bufferDesc) and save each data page identifier (BufferTag) in the shared memory to a process-local memory array (such as tags). Release the shared memory lock. Store the tags in the shared memory into the sba_init file and release the memory, completing the effective acquisition of the data.
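A minimal sketch of the per-process recording described above: access records accumulate in process-local memory and are flushed to the track file when a threshold or timer is reached. The file name, threshold, and timer interval are assumptions for illustration only.

```python
import time

class AccessRecorder:
    def __init__(self, track_path="sba_track", threshold=1000, flush_interval=5.0):
        self.track_path = track_path
        self.threshold = threshold
        self.flush_interval = flush_interval
        self.local_buffer = []                  # process-local memory of page identifiers
        self.last_flush = time.monotonic()

    def record(self, page_id):
        self.local_buffer.append(page_id)
        timer_expired = time.monotonic() - self.last_flush >= self.flush_interval
        if len(self.local_buffer) >= self.threshold or timer_expired:
            self.flush()

    def flush(self):
        with open(self.track_path, "a") as f:   # append to the second disk file
            f.write("\n".join(self.local_buffer) + "\n")
        self.local_buffer.clear()
        self.last_flush = time.monotonic()
```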
According to the embodiment, the data page identification sequence can be acquired in the shared cache region based on the access time of the data page identification, so that the accurate and rapid composition of the data page identification sequence is realized.
S240, collecting a first identification sequence from the suggested starting time to the current collecting time from the shared buffer area; and obtaining the data page identification sequence of the shared cache region until the current acquisition time is the recommended termination time.
In this embodiment, when performing collection, the name of the data storage file needs to be changed, so as to avoid the influence of collection requests of other processes on the collection process.
Illustratively, take the first disk file as sba_init and the second disk file as sba_track. After it is determined that the sba_init and sba_track files exist, the two files are renamed to sba_init_temp and sba_track_temp, so that other processes are prevented from writing to them again.
The first identification sequence is the sequence of data page identifiers collected from the suggested start time to the current collection time; since the current collection time has not reached the suggested end time, the collection process does not finish until the current collection time reaches the suggested end time. Illustratively, suppose the suggested start time is 9:00 on April 20, 2021, the suggested end time is 12:00 on April 20, 2021, and the current collection time is 10:00 on April 20, 2021. The first identification sequence is then the data page identification sequence collected from 9:00 to 10:00 on April 20, 2021; a second identification sequence is collected as the current collection time advances from 10:00 to 12:00 on April 20, 2021, i.e., the data page identification sequence collected from 10:00 to 12:00 on April 20, 2021. The data page identification sequence of the shared cache region from the suggested start time to the suggested end time can then be obtained from the first identification sequence and the second identification sequence.
In this embodiment, when the data page identification sequence is collected, the suggested termination time is compared with the current time so that collection covers the full time window, avoiding incomplete collected information caused by the collection time not yet having reached the suggested termination time.
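The window check in S220-S240 can be sketched as follows; collect_between is a hypothetical helper that returns the page identifiers recorded in the shared cache region between two instants, and start/end/now may be datetime objects.

```python
def collect_sequence(start, end, now, collect_between):
    # If the current collection time already exceeds the suggested end time,
    # the whole sequence from start to end can be collected at once.
    if now >= end:
        return collect_between(start, end)
    # Otherwise collect a first sequence up to now, then (after waiting until
    # the suggested end time) a second sequence covering the remainder.
    first = collect_between(start, now)
    second = collect_between(now, end)
    return first + second
```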
And S250, counting the number of the data page identifications which do not repeat in the identification sequence of the data page.
In this embodiment, the data page identifiers stored in the shared cache area may be the same or different; the number of the data page identifications with non-repeated identifications can effectively reflect the actual storage performance of the shared cache region.
And S260, taking the number of the data page identifications with non-repeated identifications as the storage capacity value of the shared cache region.
Illustratively, the data page identification sequences stored in the shared cache include "10112", "12003", "11520", "10112", "22321", and "12314"; each data page identifier is a data identifier of one page in the shared cache region, and the size of each data page identifier can be 8K; then, the number of data page identifiers whose identifiers do not overlap is 5, and the storage capacity value of the shared buffer is 5 × 8K.
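The example above maps directly to a few lines of code; the sequence and page size are taken from the example itself.

```python
page_id_sequence = ["10112", "12003", "11520", "10112", "22321", "12314"]
PAGE_SIZE = 8 * 1024                             # each data page identifier covers 8K

distinct_pages = len(set(page_id_sequence))      # 5 non-repeated identifiers
capacity_value = distinct_pages * PAGE_SIZE      # 5 * 8K = 40K
print(distinct_pages, capacity_value)
```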
Fig. 3 is a schematic flowchart of another processing method for a shared cache of a database according to an embodiment of the present disclosure. This embodiment further expands and optimizes the foregoing embodiments and can be combined with any of the optional alternatives in the above technical solutions. As shown in fig. 3, the method includes:
s310, receiving a shared cache suggestion request of a target database; wherein the shared cache proposal request includes a proposal start time and a proposal end time.
S320, acquiring a data page identification sequence of the shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by the database process accessing the shared cache region.
S330, determining the storage capacity value of the shared cache region according to the data page identification sequence.
S340, obtaining the times of the shared cache hit of the data page sequence in the data page identification sequence; and obtaining the times of the hit of the data page sequence by the disk cache in the data page identification sequence.
In this embodiment, when a data page in the sequence is hit by the shared cache, the data is obtained directly from the shared cache region; when a data page in the sequence is hit by the disk cache, it is not found in the shared cache region and is instead looked up on disk.
This can be achieved by creating a hash table and an array, as shown in the following example.
An array and a hash table are established for each shared memory size value. Computation is accelerated by simulating a storage-management linked list, such as a Least Recently Used (LRU) list, with an array. The array length is: shared memory size / data page size + 1. Each array element contains: prev, the index of the previous element in the LRU linked list; BufferTag, the shared cache identifier; and next, the index of the next element in the LRU linked list. The 0th element of the array does not participate in the LRU linked list but records the positions of the first and last elements of the list; the element referenced by the next value of the 0th element is the one taken for replacement each time. The key of the hash table is a BufferTag and the value is the index in the array, so looking up a BufferTag in the hash table quickly yields the position of the data block in the simulated shared memory.
Referring to FIG. 4, FIG. 4 is a schematic diagram of element permutation in an array; the initial elements are: p1, P2, P3, P4, P5 and P6, when element P3 is accessed, its position is converted, the elements in the new array are: p1, P2, P4, P5, P6 and P3.
FIG. 5 is a schematic diagram of replacement during data access. Suppose P1 has been accessed and the data items, ordered from oldest to newest access, are P1, P2 and P3. If the next data page accessed in the input sequence is P1, its index 1 in the LRU array is obtained from the hash table entry for P1, and after the update the LRU data pages, from oldest to newest, become P2, P3 and P1. If the next data page accessed in the input sequence is then P4 and the maximum length of the array is 3, P2 in the LRU array is replaced by P4, and the arrangement of data pages in the LRU from oldest to newest becomes P3, P1 and P4.
Specifically, the non-repeating BufferTags stored in sba_init_temp are added to the array and the hash table to form the initial state of the simulated shared memory. The BufferTags saved in sba_track_temp are then read in sequence and looked up in the hash table. If a value can be read, the shared cache is considered hit and the matching array element is moved to the tail of the LRU linked list. If there is no value in the hash table, one IO read is considered to have occurred; if the number of elements stored in the array has not reached the maximum, a new element is allocated, placed at the tail of the LRU linked list, and inserted into the hash table at the same time. If there is no value in the hash table and the array has already reached the maximum, a shared memory replacement is considered to occur: the least recently used array element, i.e., the first element of the LRU list, is taken from the list, its BufferTag value is deleted from the hash table, its BufferTag value is replaced with the BufferTag currently read from sba_track_temp, and the element is moved to the tail of the LRU list.
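The replay just described can be condensed into the following sketch. For readability, Python's OrderedDict stands in for the prev/next array plus hash table; the hit, IO-read, and replacement counting follows the steps above, but this is an illustrative simplification rather than the patented implementation.

```python
from collections import OrderedDict

def simulate_shared_cache(initial_tags, track_tags, max_pages):
    """Replay a recorded access sequence against a simulated shared memory of
    max_pages data pages, counting shared cache hits, IO reads and replacements."""
    lru = OrderedDict()                        # BufferTag -> None, least recently used first
    for tag in dict.fromkeys(initial_tags):    # non-repeating snapshot from sba_init_temp
        if len(lru) < max_pages:
            lru[tag] = None

    hits = io_reads = replacements = 0
    for tag in track_tags:                     # accesses recorded in sba_track_temp
        if tag in lru:                         # found in the hash table: shared cache hit
            hits += 1
            lru.move_to_end(tag)               # move to the tail of the LRU list
        else:
            io_reads += 1                      # one IO read
            if len(lru) >= max_pages:          # array full: shared memory replacement
                lru.popitem(last=False)        # evict the least recently used element
                replacements += 1
            lru[tag] = None                    # insert the newly read BufferTag at the tail
    return hits, io_reads, replacements
```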
In this embodiment, optionally, before determining the disk access time of the shared cache region under the storage capacity value, the method of this embodiment further includes:
and acquiring the actual access times of the disk cache and the access time of the actual access times.
The actual number of disk cache accesses is the number of accesses recorded in the disk file during data acquisition, which may include data writes and data reads; the access time of the actual accesses is the corresponding data write time and data read time.
And S350, determining the disk access time of the shared cache region under the storage capacity value.
In this embodiment, optionally, determining the disk access time of the shared cache area under the storage capacity value includes:
and determining the disk access time of the shared cache region under the storage capacity value according to the actual access times, the access time of the actual access times and the times of the data page sequence hit by the disk cache.
The disk access time can comprise writing time and reading time; the disk access time of the shared cache region under the storage capacity value can be accurately determined according to the actual access times, the access time of the actual access times and the times of hitting the data page sequence by the disk cache.
The following formula can be specifically referred to.
T_recommend = T_IO / N_Access × N_IO
where T_recommend is the disk access time of the shared cache region under the storage capacity value; T_IO is the access time of the actual number of accesses; N_IO is the number of times the data page sequence is hit by the disk cache; and N_Access is the actual number of accesses.
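A worked example of this estimate, with invented numbers: the average time per actual disk access is scaled by the number of IO reads predicted by the simulation at the candidate capacity.

```python
t_io = 12.0        # access time of the actual accesses (seconds)
n_access = 40000   # actual number of disk cache accesses during collection
n_io = 15000       # times the data page sequence is hit by the disk cache in the simulation

t_recommend = t_io / n_access * n_io
print(t_recommend)  # 4.5 seconds of disk access time under this storage capacity value
```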
And S360, determining a target cache information table of the shared cache region according to the number of times of hit of the data page sequence by the shared cache, the number of times of hit of the data page sequence by the disk cache, the disk access time under the storage capacity value and the storage capacity value.
In this embodiment, the target cache information table of the shared cache region, determined from the number of shared cache hits of the data page sequence, the number of disk cache hits of the data page sequence, the disk access time under the storage capacity value, and the storage capacity value itself, allows the user to intuitively understand the access performance of the shared cache region under different storage capacity values.
Specifically, based on the determined storage capacity value of the shared cache region, different percentages of that value are taken, and for each resulting shared cache storage capacity configuration the system's cache hit rate, cache miss rate, cache replacement count, and IO access time are calculated.
Illustratively, percentages are given as 100%, 80%, 50%, 30% and 20%; see table 1 below for details.
Table 1 target cache information table with different storage capacity values corresponding to shared cache region
(The body of Table 1 is provided as an image in the original publication; for each candidate storage capacity value it lists the logical reads, physical reads, cache hit rate, cache miss rate, cache replacement count, and IO access time.)
Here, a logical read is a read from the shared cache region and a physical read is a read from disk. The determined storage capacity value of the shared cache region is 644 MB; a percentage of 80% corresponds to a storage capacity value of 515 MB; 50% corresponds to 322 MB; 30% corresponds to 193 MB; and 20% corresponds to 128 MB.
The target cache information table can be returned to the user, so that the user can intuitively and effectively understand the data access benefits corresponding to the different storage capacity values.
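A sketch of how such a table could be assembled: rerun the simulation at different percentages of the recommended capacity and apply the disk-access-time estimate above. It reuses the simulate_shared_cache sketch shown earlier; the percentages, names, and row layout are illustrative assumptions.

```python
PAGE_SIZE = 8 * 1024

def build_cache_info_table(initial_tags, track_tags, recommended_pages,
                           t_io, n_access, percentages=(1.0, 0.8, 0.5, 0.3, 0.2)):
    rows = []
    for pct in percentages:
        pages = max(1, int(recommended_pages * pct))
        hits, io_reads, replacements = simulate_shared_cache(initial_tags, track_tags, pages)
        total = hits + io_reads
        rows.append({
            "capacity_mb": pages * PAGE_SIZE // (1024 * 1024),
            "hit_rate": hits / total if total else 0.0,        # logical read share
            "miss_rate": io_reads / total if total else 0.0,   # physical read share
            "replacements": replacements,
            "io_time": t_io / n_access * io_reads,             # estimate from the formula above
        })
    return rows
```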
In this embodiment, optionally, the method of this embodiment further includes:
and returning the storage capacity value of the shared cache region for selectively configuring the storage capacity of the shared cache region.
The storage capacity values correspond to different suggestion times; the suggested storage capacity values at different times are returned to the user so that the user can selectively configure the storage capacity of the target database with reference to these values.
When the database system starts, an acquisition process is launched that performs acquisition once every 5 seconds. Each acquisition records the size of the sba_track file, and each acquisition reads the sba_track file starting from the size recorded at the previous acquisition. The BufferTags in the sba_track file are put into a hash table whose key is the BufferTag; because keys do not repeat, this ensures that only non-repeating BufferTags are recorded. The acquisition time and the number of elements in the hash table are then stored into a data table.
See table 2 below for feedback to the user.
TABLE 2 schematic table of memory capacity values
(The body of Table 2 is provided as an image in the original publication; it lists the suggested storage capacity value recorded at each acquisition time.)
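The periodic acquisition loop described before Table 2 might look like the sketch below; the file name, 5-second interval, and in-memory data table are assumptions, and a set plays the role of the hash table keyed by BufferTag.

```python
import time
from datetime import datetime

def run_acquisition(track_path="sba_track", interval=5, data_table=None):
    data_table = [] if data_table is None else data_table
    seen_tags = set()      # keys never repeat, so only distinct BufferTags are counted
    last_size = 0          # size of the sba_track file recorded at the previous acquisition
    while True:
        with open(track_path, "r") as f:
            f.seek(last_size)                 # start where the previous acquisition stopped
            for line in f:
                seen_tags.add(line.strip())
            last_size = f.tell()
        data_table.append((datetime.now(), len(seen_tags)))   # acquisition time + element count
        time.sleep(interval)
```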
Fig. 6 is a schematic structural diagram of a processing apparatus for a database shared cache according to an embodiment of the present disclosure; the apparatus is configured in an electronic device and can implement the processing method for the database shared cache in any embodiment of the present application. The apparatus specifically includes the following modules:
a request receiving module 610, configured to receive a request for a shared cache suggestion of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time;
a sequence collection module 620, configured to collect a data page identifier sequence of the shared cache area based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by a database process accessing the shared cache region;
a storage capacity value determining module 630, configured to determine a storage capacity value of the shared cache region according to the data page identifier sequence.
In this embodiment, optionally, the sequence acquisition module 620 includes: the device comprises a time judging unit and a sequence acquisition unit;
the time judging unit is used for judging whether the current acquisition time exceeds the recommended termination time or not;
and if so, acquiring a data page identification sequence from the suggested start time to the suggested end time from the shared buffer area.
In this embodiment, optionally, the sequence acquiring unit is further configured to acquire, if not, a first identifier sequence from the recommended start time to the current acquisition time from the shared cache region; and obtaining the data page identification sequence of the shared cache region until the current acquisition time is the suggested termination time.
In this embodiment, optionally, the storage capacity value determining module 630 is specifically configured to:
counting the number of data page identifications with non-repeated identifications in the data page identification sequence;
and taking the number of the data page identifications with non-repeated identifications as the storage capacity value of the shared cache region.
In this embodiment, optionally, the apparatus of this embodiment further includes: the device comprises a frequency acquisition module, a time determination module and an information table determination module;
the number obtaining module is used for obtaining the number of times of hit of the data page sequence by the shared cache in the data page identification sequence; acquiring the times of the data page sequence hit by the disk cache in the data page identification sequence;
the time determining module is used for determining the disk access time of the shared cache region under the storage capacity value;
and the information table determining module is used for determining a target cache information table of the shared cache region according to the number of times of hit of the data page sequence by the shared cache, the number of times of hit of the data page sequence by the disk cache, the disk access time under the storage capacity value and the storage capacity value.
In this embodiment, optionally, the apparatus of this embodiment further includes:
the time acquisition module is used for acquiring the actual access times of the disk cache and the access time of the actual access times;
a time determination module specifically configured to:
and determining the disk access time of the shared cache region under the storage capacity value according to the actual access times, the access time of the actual access times and the times of the data page sequence hit by the disk cache.
In this embodiment, optionally, the apparatus of this embodiment further includes:
and the storage capacity value returning module is used for returning the storage capacity value of the shared cache region and is used for selectively configuring the storage capacity of the shared cache region.
By the processing device of the database shared cache, the storage capacity value can be determined according to the historical process data of the shared cache region, the problem that the stability of the determined storage capacity value is low due to the dependence on manual experience is solved, and therefore the storage capacity value of the shared cache region is accurately estimated.
The processing device for the database shared cache provided by the embodiment of the invention can execute the processing method for the database shared cache provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 7, the electronic apparatus includes a processor 710, a memory 720, an input device 730, and an output device 740; the number of the processors 710 in the electronic device may be one or more, and one processor 710 is taken as an example in fig. 7; the processor 710, the memory 720, the input device 730, and the output device 740 in the electronic apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 7.
The memory 720 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the processing method of the database-shared cache in the embodiment of the present invention. The processor 710 executes software programs, instructions and modules stored in the memory 720, so as to execute various functional applications and data processing of the electronic device, that is, implement the processing method of the database shared cache provided by the embodiment of the present invention.
The memory 720 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 720 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 720 may further include memory located remotely from the processor 710, which may be connected to an electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 730 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, and may include a keyboard, a mouse, and the like. The output device 740 may include a display device such as a display screen.
The embodiment of the disclosure also provides a storage medium containing computer executable instructions, and the computer executable instructions are used for realizing the processing method of the database shared cache provided by the embodiment of the invention when being executed by a computer processor.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the database shared cache processing method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the above apparatus, the included units and modules are merely divided according to functional logic, but the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not used to limit the protection scope of the present invention.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A processing method for a database shared cache is characterized by comprising the following steps:
receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time;
acquiring a data page identification sequence of a shared cache region based on the suggested start time and the suggested end time; the data page identification sequence is a page identification set recorded by a database process accessing the shared cache region;
and determining the storage capacity value of the shared cache region according to the data page identification sequence.
2. The method of claim 1, wherein collecting a sequence of data page identifications for a shared buffer based on the suggested start time and the suggested end time comprises:
judging whether the current acquisition time exceeds the recommended termination time or not;
and if so, acquiring a data page identification sequence from the suggested start time to the suggested end time from the shared cache region.
3. The method of claim 2, wherein after determining whether the current acquisition time exceeds the recommended expiration time, the method further comprises:
if not, acquiring a first identification sequence from the suggested starting time to the current acquisition time from a shared cache region; and obtaining the data page identification sequence of the shared cache region until the current acquisition time is the suggested termination time.
4. The method of claim 1, wherein determining the storage capacity value of the shared buffer according to the data page identification sequence comprises:
counting the number of data page identifications with non-repeated identifications in the data page identification sequence;
and taking the number of the data page identifications with non-repeated identifications as the storage capacity value of the shared cache region.
5. The method of claim 1, wherein after determining the storage capacity value of the shared buffer, the method further comprises:
obtaining the times of the data page sequence hit by the shared cache in the data page identification sequence; acquiring the times of the data page sequence hit by the disk cache in the data page identification sequence;
determining the disk access time of the shared cache region under the storage capacity value;
and determining a target cache information table of the shared cache region according to the number of times of hit of the data page sequence by the shared cache, the number of times of hit of the data page sequence by the disk cache, the disk access time under the storage capacity value and the storage capacity value.
6. The method of claim 5, wherein prior to determining the disk access time of the shared cache at the storage capacity value, the method further comprises:
acquiring actual access times of a disk cache and access time of the actual access times;
the determining the disk access time of the shared cache region under the storage capacity value includes:
and determining the disk access time of the shared cache region under the storage capacity value according to the actual access times, the access time of the actual access times and the times of the data page sequence hit by the disk cache.
7. The method of claim 1, further comprising:
and returning the storage capacity value of the shared cache region for carrying out selective configuration on the storage capacity of the shared cache region.
8. A processing apparatus for sharing a cache in a database, the apparatus comprising:
the request receiving module is used for receiving a shared cache suggestion request of a target database; wherein the shared cache suggestion request comprises a suggestion start time and a suggestion end time;
the sequence acquisition module is used for acquiring a data page identification sequence of the shared cache region based on the suggestion starting time and the suggestion ending time; the data page identification sequence is a page identification set recorded by a database process accessing the shared cache region;
and the storage capacity value determining module is used for determining the storage capacity value of the shared cache region according to the data page identification sequence.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of processing a database shared cache according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method for processing the shared cache of the database according to any one of claims 1 to 7.
CN202110428707.2A 2021-04-21 2021-04-21 Processing method and device for database shared cache, electronic equipment and medium Active CN113177031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110428707.2A CN113177031B (en) 2021-04-21 2021-04-21 Processing method and device for database shared cache, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110428707.2A CN113177031B (en) 2021-04-21 2021-04-21 Processing method and device for database shared cache, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN113177031A true CN113177031A (en) 2021-07-27
CN113177031B CN113177031B (en) 2023-08-01

Family

ID=76923954

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110428707.2A Active CN113177031B (en) 2021-04-21 2021-04-21 Processing method and device for database shared cache, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN113177031B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020087500A1 (en) * 1998-08-18 2002-07-04 Brian T. Berkowitz In-memory database system
JP2013152539A (en) * 2012-01-24 2013-08-08 Toshiba Corp Memory device, memory device control method and memory device control program
CN106021370A (en) * 2016-05-11 2016-10-12 智者四海(北京)技术有限公司 Memory database instance management method and device
CN109542912A (en) * 2018-12-04 2019-03-29 北京锐安科技有限公司 Interval censored data storage method, device, server and storage medium
CN110555001A (en) * 2019-09-05 2019-12-10 腾讯科技(深圳)有限公司 data processing method, device, terminal and medium
CN110704336A (en) * 2019-09-26 2020-01-17 北京神州绿盟信息安全科技股份有限公司 Data caching method and device
CN110889629A (en) * 2019-11-27 2020-03-17 陕西格物实业有限公司 Aerial, underwater and outdoor convergence, distribution and fusion media system for unmanned aerial vehicle
CN111190655A (en) * 2019-12-30 2020-05-22 中国银行股份有限公司 Processing method, device, equipment and system for application cache data
CN111309720A (en) * 2018-12-11 2020-06-19 北京京东尚科信息技术有限公司 Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium
US20200257628A1 (en) * 2017-11-22 2020-08-13 Intel Corporation File pre-fetch scheduling for cache memory to reduce latency
CN111737168A (en) * 2020-06-24 2020-10-02 华中科技大学 Cache system, cache processing method, device, equipment and medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020087500A1 (en) * 1998-08-18 2002-07-04 Brian T. Berkowitz In-memory database system
JP2013152539A (en) * 2012-01-24 2013-08-08 Toshiba Corp Memory device, memory device control method and memory device control program
CN106021370A (en) * 2016-05-11 2016-10-12 智者四海(北京)技术有限公司 Memory database instance management method and device
US20200257628A1 (en) * 2017-11-22 2020-08-13 Intel Corporation File pre-fetch scheduling for cache memory to reduce latency
CN109542912A (en) * 2018-12-04 2019-03-29 北京锐安科技有限公司 Interval censored data storage method, device, server and storage medium
CN111309720A (en) * 2018-12-11 2020-06-19 北京京东尚科信息技术有限公司 Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium
CN110555001A (en) * 2019-09-05 2019-12-10 腾讯科技(深圳)有限公司 data processing method, device, terminal and medium
CN110704336A (en) * 2019-09-26 2020-01-17 北京神州绿盟信息安全科技股份有限公司 Data caching method and device
CN110889629A (en) * 2019-11-27 2020-03-17 陕西格物实业有限公司 Aerial, underwater and outdoor convergence, distribution and fusion media system for unmanned aerial vehicle
CN111190655A (en) * 2019-12-30 2020-05-22 中国银行股份有限公司 Processing method, device, equipment and system for application cache data
CN111737168A (en) * 2020-06-24 2020-10-02 华中科技大学 Cache system, cache processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN113177031B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN107491523B (en) Method and device for storing data object
CN110489405B (en) Data processing method, device and server
CN109766318B (en) File reading method and device
CN112148736B (en) Method, device and storage medium for caching data
CN110413545B (en) Storage management method, electronic device, and computer program product
CN111506604A (en) Method, apparatus and computer program product for accessing data
KR101806394B1 (en) A data processing method having a structure of the cache index specified to the transaction in a mobile environment dbms
CN111158601A (en) IO data flushing method, system and related device in cache
CN111177090A (en) Client caching method and system based on sub-model optimization algorithm
WO2016175880A1 (en) Merging incoming data in a database
CN115827702B (en) Software white list query method based on bloom filter
CN111694806A (en) Transaction log caching method, device, equipment and storage medium
CN111221468B (en) Storage block data deleting method and device, electronic equipment and cloud storage system
CN111913913A (en) Access request processing method and device
CN113177031B (en) Processing method and device for database shared cache, electronic equipment and medium
JP2001282599A (en) Method and device for managing data and recording medium with data management program stored therein
CN116244214A (en) Model parameter acquisition method and device, server device and storage medium
CN110413617B (en) Method for dynamically adjusting hash table group according to size of data volume
CN111104435B (en) Metadata organization method, device and equipment and computer readable storage medium
CN109828720B (en) Data storage method, device, server and storage medium
CN113139002A (en) Hot spot data caching method based on Redis
CN115826886B (en) Data garbage collection method, device and system in additional write mode and storage medium
CN115374301B (en) Cache device, method and system for realizing graph query based on cache device
CN117056363B (en) Data caching method, system, equipment and storage medium
CN110334251B (en) Element sequence generation method for effectively solving rehash conflict

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant