CN109981774B - Data caching method and data caching device
- Publication number: CN109981774B (application number CN201910226551.2A)
- Authority: CN (China)
- Prior art keywords
- data
- pieces
- cache
- disk area
- caching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The present disclosure provides a data caching method. The method comprises the following steps: obtaining the storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, where the first cache data is the data cached from the first disk area to the memory at a first moment; obtaining one or more pieces of data whose content changed in the first disk area during a first period, where the first period lasts from the first moment to the current time; determining cache information of the one or more pieces of data in the memory based on the storage correspondence; and re-caching the one or more pieces of data based on that cache information to update the first cache data. The present disclosure also provides a data caching device.
Description
Technical Field
The present disclosure relates to a data caching method and a data caching apparatus.
Background
Existing compute engines, such as Spark, typically cache all data of a service in its entirety each time caching is performed. In practice, however, data changes dynamically (for example, daily incremental service data), and usually only part of it changes at a time. Yet the compute engine cannot update the cache for just the changed part while retaining the unchanged cached data: the prior art updates cached data by re-caching all data of a service as a whole, so all originally cached data is overwritten or discarded. As data volume grows and data keeps changing, re-caching everything each time places a heavy burden on the cluster and affects the normal operation of the service.
Disclosure of Invention
One aspect of the present disclosure provides a data caching method. The data caching method comprises the following steps: obtaining the storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, where the first cache data is the data cached from the first disk area to the memory at a first moment; obtaining one or more pieces of data whose content changed in the first disk area during a first period, where the first period lasts from the first moment to the current time; determining cache information of the one or more pieces of data in the memory based on the storage correspondence; and re-caching the one or more pieces of data based on that cache information to update the first cache data.
Optionally, obtaining the one or more pieces of data whose content changed in the first disk area during the first period includes monitoring the data query operations performed on the first disk area during the first period, and obtaining the one or more pieces of data based on those operations.
Optionally, obtaining the one or more pieces of data whose content changed in the first disk area during the first period includes obtaining a log recording the operations performed on the first disk area during the first period, analyzing the log to extract the operation information that changes data content, and obtaining the one or more pieces of data based on that operation information.
Optionally, obtaining the one or more pieces of data whose content changed in the first disk area during the first period includes recording, in real time, the time of the latest content-change operation performed on each piece of data in the first disk area, and obtaining, based on that time information, all data whose latest content-change operation falls within the first period, to obtain the one or more pieces of data.
Optionally, re-caching the one or more pieces of data based on their cache information includes: determining the data partition or data block to which the one or more pieces of data belong in the memory based on their cache information; and re-caching, as a whole, the data currently stored in the first disk area that corresponds to that data partition or data block.
Optionally, re-caching the one or more pieces of data based on their cache information includes re-caching each piece of the one or more pieces of data according to its corresponding cache information.
Optionally, re-caching each piece of the one or more pieces of data according to its corresponding cache information includes: when the one or more pieces of data include content-update data for original data in the first disk area, replacing the original data cached in the memory with the content-update data; when the one or more pieces of data include data newly added in the first disk area, adding a cache of the newly added data in the memory; and when the one or more pieces of data include original data whose content was deleted from the first disk area, deleting the original data cached in the memory.
Another aspect of the present disclosure provides a data caching apparatus. The data caching apparatus comprises a correspondence obtaining module, a disk data change obtaining module, a cache change determining module, and a cache updating module. The correspondence obtaining module is configured to obtain the storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, where the first cache data is the data cached from the first disk area to the memory at a first moment. The disk data change obtaining module is configured to obtain one or more pieces of data whose content changed in the first disk area during a first period, where the first period lasts from the first moment to the current time. The cache change determining module is configured to determine cache information of the one or more pieces of data based on the storage correspondence. The cache updating module is configured to re-cache the one or more pieces of data based on their cache information to update the first cache data.
Optionally, the disk data change acquiring module is specifically configured to monitor the data query operations performed on the first disk area during the first period, and to acquire the one or more pieces of data based on those operations.
Optionally, the disk data change acquiring module is specifically configured to acquire a log recording the operations performed on the first disk area during the first period, analyze the log to extract the operation information that changes data content, and acquire the one or more pieces of data based on that operation information.
Optionally, the disk data change acquiring module is specifically configured to record, in real time, the time of the latest content-change operation performed on each piece of data in the first disk area, and to acquire, based on that time information, all data whose latest content-change operation falls within the first period, to obtain the one or more pieces of data.
Optionally, the cache updating module is specifically configured to determine the data partition or data block to which the one or more pieces of data belong in the memory based on their cache information, and to re-cache, as a whole, the data currently stored in the first disk area that corresponds to that data partition or data block.
Optionally, the cache updating module is specifically configured to re-cache each piece of the one or more pieces of data according to its corresponding cache information.
Optionally, re-caching each piece of the one or more pieces of data according to its corresponding cache information includes: when the one or more pieces of data include content-update data for original data in the first disk area, replacing the original data cached in the memory with the content-update data; when the one or more pieces of data include data newly added in the first disk area, adding a cache of the newly added data in the memory; and when the one or more pieces of data include original data whose content was deleted from the first disk area, deleting the original data cached in the memory.
Another aspect of the present disclosure provides a data caching system. The data caching system includes one or more memories storing computer-executable instructions and one or more processors; the one or more processors execute the instructions to implement the data caching method described above.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the data caching method as described above when executed.
Another aspect of the present disclosure provides a computer program comprising computer executable instructions for implementing the data caching method as described above when executed.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
fig. 1 schematically illustrates an application scenario of a data caching method and a data caching apparatus according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a data caching method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of obtaining data with changed content in a first disk area according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a method of obtaining data with changed content in a first disk area according to another embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow chart of a method of obtaining data with changed content in a first disk area according to yet another embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow diagram of a method of re-caching partial data according to an embodiment of the present disclosure;
FIG. 7 schematically shows a block diagram of a data caching apparatus according to an embodiment of the present disclosure; and
FIG. 8 schematically illustrates a block diagram of a computer system suitable for implementing data caching, according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, such a construction is generally intended in the sense one having skill in the art would understand it (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, such a construction is likewise intended in the sense one having skill in the art would understand it (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system.
When a compute engine performs big data operations, some data (for example, data needed by repeated or iterative operations) is generally cached in memory to improve processing efficiency. The embodiments of the present disclosure provide a data caching method and a data caching apparatus. The data caching method comprises: obtaining the storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, where the first cache data is the data cached from the first disk area to the memory at a first moment; obtaining one or more pieces of data whose content changed in the first disk area during a first period, where the first period lasts from the first moment to the current time; determining cache information of the one or more pieces of data based on the storage correspondence; and re-caching the one or more pieces of data based on that cache information to update the first cache data.
With the data caching method and apparatus of the embodiments of the present disclosure, only the part of the data in the first disk area whose content changed needs to be re-cached when the cache is updated. Compared with the prior art, in which all data in the first disk area must be re-cached every time, this greatly reduces the amount of data in each caching pass and shortens the time each pass takes; the interval between two adjacent caching passes can therefore be reduced, the real-time quality of the cached data improves, and caching efficiency for the first disk area in actual service improves greatly.
Fig. 1 schematically illustrates an application scenario 100 of a data caching method and a data caching apparatus according to an embodiment of the present disclosure. Note that fig. 1 is only an example of an application scenario to which the embodiments of the present disclosure may be applied, intended to help those skilled in the art understand the technical content of the disclosure; it does not mean that the embodiments cannot be applied to other devices, systems, environments, or scenarios.
As shown in fig. 1, the application scenario 100 includes a first disk area 101, a memory 102, and a compute engine 103. Data in the first disk area 101 is cached in the memory 102, and when the content of data in the first disk area 101 changes, the changed part can be re-cached so that the data in the first disk area 101 stays synchronized with the cached data in the memory 102. The compute engine 103 may read the data cached in the memory 102 for data processing; it may be, for example, Spark. In some embodiments, the compute engine 103 may also read data directly from the first disk area 101.
The first disk area 101 may be, for example, a storage area of a service database, which may be a relational database (e.g., a MySQL, Oracle, or SQL Server database) or a non-relational database; the data in the first disk area 101 is then the data of that service database. Alternatively, in other embodiments, the first disk area 101 may be a data storage area of a data warehouse, such as a distributed file system (e.g., the Hadoop distributed file system); the data in the first disk area 101 may then be, for example, a Hive table in the distributed file system.
According to the data caching method and apparatus of the embodiments of the present disclosure, based on the storage correspondence of the data cached from the first disk area 101 to the memory 102, when the content of part of the data in the first disk area 101 changes, the corresponding part of the cached data in the memory 102 is updated, keeping the data in the first disk area 101 synchronized with the cached data in the memory 102. For example, during caching, data A, data B, and data C in the first disk area 101 in fig. 1 are cached in data block a, data block b, and data block c, respectively. If only data A in the first disk area 101 changes, only the cached data in data block a needs to be updated. Synchronization between the cached data in the memory 102 and the data in the first disk area 101 can thus be achieved quickly, the amount of data handled per cache update drops markedly, the real-time quality of the cached data improves, and the timeliness and accuracy of business analysis performed by compute engines 103 such as Spark improve greatly.
Fig. 2 schematically shows a flow chart of a data caching method according to an embodiment of the present disclosure.
As shown in fig. 2, the data caching method may include operations S210 to S240.
In operation S210, the storage correspondence of each piece of data in first cache data cached from the first disk area 101 to the memory 102 is obtained, where the first cache data is the data cached from the first disk area 101 to the memory at a first moment.
Specifically, for example, all data-related information of the first cache data may be recorded to obtain the storage correspondence, including the original storage information of each piece of data in the first disk area 101 and its cache information in the memory 102. The original storage information may be, for example, the storage location information of each piece of data (such as data A, B, and C shown in fig. 1) in the first disk area 101, including the storage path of the database or data table where the data resides, the data's position (such as row or column information) within that database or data table, and its storage format. For example, if data A is stored in a Hive table in the first disk area 101, its original storage information may include the Hive table's storage path and data A's row and column field information within the table. The cache information may record, for example, which partition of the memory 102 each piece of data is cached in, which data block of that partition, and the cache format. In fig. 1, data A is cached in data block a, data B in data block b, and data C in data block c. From the original storage information and cache information of each piece of data, the storage correspondence of each piece of data from the first disk area 101 to the memory 102 can then be established.
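As a minimal sketch of the storage correspondence that operation S210 records, the following Python snippet models each piece of data together with its original storage information and its cache information. All names, paths, and values here are illustrative assumptions, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StorageMapping:
    """Hypothetical record tying one piece of data to both locations."""
    table_path: str   # storage path of the table in the first disk area
    row_key: str      # row identifier within that table
    partition: int    # memory partition holding the cached copy
    block: str        # data block within the partition

# Built when the first cache data is created at the first moment.
correspondence = {
    "A": StorageMapping("warehouse/sales.hive", "row-1", partition=0, block="a"),
    "B": StorageMapping("warehouse/sales.hive", "row-2", partition=0, block="b"),
    "C": StorageMapping("warehouse/sales.hive", "row-3", partition=1, block="c"),
}

# Later, given a changed piece of data, look up where its cached copy lives.
changed = "A"
print(correspondence[changed].block)  # prints "a": the block to refresh
```

Looking a changed piece of data up in `correspondence` yields exactly the partition and data block that must be refreshed, which is what operation S230 relies on.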
In operation S220, one or more pieces of data whose content changed in the first disk area 101 during a first period are obtained, where the first period lasts from the first moment to the current time. The first period is the time interval between two adjacent caching passes, and its length can be set according to service requirements.
Specifically, for example, data changes in the first disk area 101 during the first period may be monitored dynamically to obtain the one or more pieces of data. A content change may be a content update, newly added data, or deleted data. Suppose the data to be cached is a sales data table in the first disk area 101. A content update may change the values of some or all fields of a piece of data, for example the sales-volume field being updated as a product's sales grow over time. Newly added data may be, for example, a new row added to the sales data table when a new product is listed. Deleted data may be, for example, a row removed from the sales data table when a product is taken off the market.
In operation S230, cache information of the one or more pieces of data in the memory 102 is determined based on the storage correspondence. For example, according to the original storage information of the one or more pieces of data, the cache information of the one or more pieces of data in the memory 102 is determined.
In operation S240, the one or more pieces of data are re-cached based on their cache information to update the first cache data. Only the changed part of the data is re-cached, and the update can be performed in two ways: coarse-grained or fine-grained. A coarse-grained update re-caches, as a whole, the data in the data partition or data block to which the one or more pieces of data belong in the memory. A fine-grained update locates the specific storage position of each of the one or more pieces of data in the memory 102 and then re-caches each piece individually.
According to the embodiments of the disclosure, the amount of data processed in each caching pass is reduced, and the amount of data updated per pass drops significantly, so the interval between two adjacent caching passes can be much shorter than in the prior art and the cached data stays closer to real time. This effectively alleviates the lag that untimely caching would otherwise introduce into the data processing performed by the compute engine 103, and thus greatly improves the accuracy and timeliness of business analysis.
Fig. 3 schematically shows a flowchart of a method for acquiring data with changed data content in the first disk area 101 in operation S220 according to an embodiment of the present disclosure.
As shown in fig. 3, operation S220 may include operation S301 and operation S302 according to an embodiment of the present disclosure.
In operation S301, a data query operation performed on the first disk region 101 in the first period is monitored.
In operation S302, the one or more pieces of data are acquired based on the data query operation.
Take the data to be cached in the first disk area 101 to be data in a MySQL database. The method flow of fig. 3 may be implemented, for example, by inserting a monitoring module at the interface where the compute engine 103 (e.g., Spark) connects to MySQL, monitoring the operations performed on the MySQL data. Whether data changed, and which data changed, can then be determined from the data operation instructions (such as update, delete, and insert instructions) issued against the MySQL database.
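A rough Python sketch of this monitoring idea follows. The statement shapes and the `monitor` hook are simplified illustrations for this document, not real MySQL or Spark interfaces:

```python
import re

# Row keys whose content changed during the first period.
changed_keys = set()

def monitor(statement: str) -> None:
    """Inspect a data-operation statement before it reaches the database
    and record the row it changes. The statement grammar here is a toy
    placeholder, not real SQL parsing."""
    m = re.match(r"(update|delete|insert)\s+(\S+)", statement, re.IGNORECASE)
    if m:
        changed_keys.add(m.group(2))

# Statements observed at the engine/database interface in the period.
for stmt in ["SELECT product_a", "UPDATE product_a", "INSERT product_d"]:
    monitor(stmt)

print(sorted(changed_keys))  # ['product_a', 'product_d']
```

Only the statements that change content (update, delete, insert) contribute to `changed_keys`; plain queries are ignored, matching operations S301 and S302.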
Fig. 4 schematically shows a flowchart of a method for acquiring data with changed data content in the first disk area 101 in operation S220 according to another embodiment of the present disclosure.
As shown in fig. 4, operation S220 may include operations S401 to S403, according to an embodiment of the present disclosure.
In operation S401, a log recording operations performed on the first disk area 101 during the first period is acquired.
In operation S402, the log is analyzed to obtain operation information for changing the data content.
In operation S403, the one or more pieces of data are acquired based on the operation information.
The MySQL database again serves as the example. For instance, Spark may monitor the OPlog (operation log) file of the MySQL database for changes; the OPlog file records the data operations performed in MySQL. By reading or monitoring the OPlog for specific operation instructions (such as update, delete, and insert), it can be determined whether data changed and which data changed.
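The log-based approach of operations S401 to S403 can be sketched as follows. The one-entry-per-line log format is purely illustrative; a real database operation log is binary and far richer:

```python
# A simplified operation log: one "operation row-key" entry per line.
log_lines = [
    "select product_a",
    "update product_a",
    "delete product_b",
    "insert product_d",
    "select product_c",
]

# Operations that change data content, per the text.
CHANGE_OPS = {"update", "delete", "insert"}

def changed_rows(lines):
    """Scan the period's log and keep only the entries whose operation
    changes data content (operation S402/S403)."""
    result = []
    for line in lines:
        op, _, key = line.partition(" ")
        if op in CHANGE_OPS:
            result.append((op, key))
    return result

print(changed_rows(log_lines))
```

The filtered list carries both the operation type and the affected row, which is exactly the operation information needed later to decide whether to replace, add, or delete a cache entry.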
Fig. 5 is a flowchart schematically illustrating a method for obtaining data with changed content in the first disk area 101 in operation S220 according to yet another embodiment of the present disclosure.
As shown in fig. 5, operation S220 may include operation S501 and operation S502 according to an embodiment of the present disclosure.
In operation S501, time information of the latest content change operation performed on each piece of data in the first disk area 101 is recorded in real time.
In operation S502, all data whose latest content-change operation occurred within the first period is acquired based on the time information, to obtain the one or more pieces of data.
The MySQL database again serves as the example. Each piece of data in a table that needs caching may carry a timestamp: for example, a column recording the time of the data's latest change can be added to the table. Each time a piece of data changes, the current time is recorded, so a timestamp is written when data is added, when it is updated, and when it is deleted. Thus, if the data in the MySQL database was cached at 10:00:00 and is cached again 5 s later (at 10:00:05), the data whose timestamp falls between 10:00:00 and 10:00:05 can be obtained.
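The timestamp scheme above can be sketched in a few lines of Python. The column name is an illustrative assumption; the times mirror the 10:00:00 to 10:00:05 example:

```python
from datetime import datetime

# Each row carries a column recording its latest content change.
rows = [
    {"key": "A", "last_change": datetime(2019, 3, 25, 10, 0, 2)},
    {"key": "B", "last_change": datetime(2019, 3, 25, 9, 59, 30)},
    {"key": "C", "last_change": datetime(2019, 3, 25, 10, 0, 4)},
]

previous_cache_time = datetime(2019, 3, 25, 10, 0, 0)  # 10:00:00
current_time = datetime(2019, 3, 25, 10, 0, 5)         # 10:00:05

# Operation S502: keep rows whose latest change falls in the first period.
changed = [r["key"] for r in rows
           if previous_cache_time <= r["last_change"] <= current_time]
print(changed)  # ['A', 'C']
```

Row B last changed before the previous caching pass, so it is left out; only rows A and C need re-caching.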
Fig. 6 schematically shows a flowchart of a method for re-caching partial data in operation S240 according to an embodiment of the present disclosure.
As shown in fig. 6, operation S240 may include operation S601 and operation S602, according to an embodiment of the present disclosure. Specifically, fig. 6 illustrates a flow of a method for performing cache update according to coarse-grained update.
In operation S601, a data partition or a data block to which the one or more pieces of data belong in the memory 102 is determined based on the cache information of the one or more pieces of data.
In operation S602, the data currently stored in the first disk area 101 and corresponding to that data partition or data block is re-cached in its entirety.
Specifically, if the one or more pieces of data include newly added data, the partition holding the newly added data is determined, for example by hash calculation, and all data in that partition is then re-cached. If the one or more pieces of data include content-updated data or deleted data, all data in the corresponding data partition or data block is cache-updated according to that data's cache information.
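A coarse-grained update along the lines of operations S601 and S602 might look like the following sketch. The toy hash partitioner and two-partition cache are assumptions for illustration, not the patent's actual layout:

```python
NUM_PARTITIONS = 2

def partition_of(key: str) -> int:
    # Stable toy hash; a real engine would use its own partitioner.
    return sum(ord(c) for c in key) % NUM_PARTITIONS

# Current contents of the first disk area, keyed by row.
disk = {"A": "a-v2", "B": "b-v1", "C": "c-v1", "D": "d-v1"}

# Cache organised as one dict per partition.
cache = [dict() for _ in range(NUM_PARTITIONS)]

def recache_partition(p: int) -> None:
    """Coarse-grained update: re-cache, as a whole, every row that
    currently belongs to partition p, straight from the disk area."""
    cache[p] = {k: v for k, v in disk.items() if partition_of(k) == p}

# Row "D" was newly added; locate its partition by hash and refresh it.
recache_partition(partition_of("D"))
```

Only the partition containing the new row is rebuilt; the other partition's cache is left untouched, which is the point of the coarse-grained path.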
According to another embodiment of the present disclosure, as described above, operation S240 may instead be a fine-grained update, that is, each piece of the one or more pieces of data is re-cached according to its corresponding cache information. For example, when the one or more pieces of data include content-update data for original data in the first disk area 101, the original data cached in the memory 102 is replaced with the content-update data. When the one or more pieces of data include data newly added in the first disk area 101, a cache of the newly added data is added in the memory 102. And when the one or more pieces of data include original data whose content was deleted from the first disk area 101, the original data cached in the memory 102 is deleted.
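The three fine-grained cases above can be sketched as follows. The cache is modeled as a plain dict, which is an illustrative simplification of the memory's partitions and blocks:

```python
# Cached copy of the first cache data (row key -> content).
cache = {"A": "a-v1", "B": "b-v1", "C": "c-v1"}

def apply_change(op: str, key: str, content=None) -> None:
    """Fine-grained update: act on exactly one cached piece of data,
    mirroring the three cases described in the text."""
    if op == "update":        # content of existing original data changed
        cache[key] = content
    elif op == "insert":      # data newly added in the disk area
        cache[key] = content
    elif op == "delete":      # original data deleted from the disk area
        cache.pop(key, None)

apply_change("update", "A", "a-v2")  # replace original with updated content
apply_change("insert", "D", "d-v1")  # add a cache entry for new data
apply_change("delete", "B")          # drop the deleted data's cache entry
print(cache)  # {'A': 'a-v2', 'C': 'c-v1', 'D': 'd-v1'}
```

Each changed piece is touched individually, so the unchanged entry C is never re-cached.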
Fig. 7 schematically shows a block diagram of a data caching apparatus 700 according to an embodiment of the present disclosure.
As shown in Fig. 7, according to an embodiment of the present disclosure, the data caching apparatus 700 includes a correspondence obtaining module 710, a disk data change acquiring module 720, a cache change determining module 730, and a cache updating module 740. The data caching apparatus 700 may be used to implement the data caching method according to an embodiment of the present disclosure.
The correspondence obtaining module 710 is configured to obtain a storage correspondence indicating how each piece of data in first cache data is cached from the first disk area 101 into the memory 102, where the first cache data is the data cached from the first disk area 101 into the memory at a first time (operation S210).
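The storage correspondence obtained in operation S210 can be sketched as a mapping from each cached piece of data to its pair of locations. The record fields (a disk location and a cache location per key) are assumptions made for this example.

```python
# Illustrative sketch of operation S210: building the storage correspondence
# between each piece of first cache data on disk and in memory.
def build_storage_correspondence(disk_locations, cache_locations):
    """Map each cached key to its (disk location, cache location) pair.

    disk_locations  : dict key -> location in the first disk area
    cache_locations : dict key -> location in memory
    """
    # Only pieces actually present in the cache get a correspondence entry.
    return {key: (disk_locations[key], cache_locations[key])
            for key in cache_locations}
```

This mapping is what later lets operation S230 translate a changed disk location back into the cache entry that must be refreshed.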
The disk data change acquiring module 720 is configured to acquire one or more pieces of data whose data content in the first disk area 101 changed during a first period, where the first period extends from the first time to the current time (operation S220).
The cache change determining module 730 is configured to determine cache information of the one or more pieces of data in the memory 102 based on the storage correspondence (operation S230).
The cache update module 740 is configured to re-cache the one or more pieces of data based on the cache information of the one or more pieces of data to update the first cache data (operation S240).
According to an embodiment of the present disclosure, the disk data change acquiring module 720 is specifically configured to: monitoring a data query operation performed on the first disk region 101 during the first period (operation S301); and acquiring the one or more pieces of data based on the data query operation (operation S302).
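Operations S301 and S302 can be sketched as intercepting the statements executed against the first disk area and keeping the pieces of data they change. The classification of statements into content-changing and read-only verbs is a simplified assumption for this example.

```python
# Illustrative sketch of operations S301-S302: monitoring the query
# operations executed during the first period and collecting the pieces
# of data whose content they change.
def collect_changed_keys(executed_statements):
    """Return keys touched by content-changing statements, in order observed.

    executed_statements : iterable of (verb, key) pairs observed during
                          the first period, e.g. ("UPDATE", "user:42")
    """
    changing_verbs = {"INSERT", "UPDATE", "DELETE"}
    changed = []
    for verb, key in executed_statements:
        # Read-only SELECTs are ignored; duplicates are collapsed.
        if verb.upper() in changing_verbs and key not in changed:
            changed.append(key)
    return changed
```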
Alternatively, according to an embodiment of the present disclosure, the disk data change acquiring module 720 is specifically configured to: acquire a log recording the operations performed on the first disk area 101 during the first period (operation S401); parse the log to obtain operation information indicating changes to the data content (operation S402); and acquire the one or more pieces of data based on the operation information (operation S403).
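The log-based approach of operations S401 to S403 can be sketched as follows. The log line format (`timestamp verb key`) is invented for this example; a real deployment would parse whatever operation log the first disk area produces.

```python
# Illustrative sketch of operations S401-S403: parsing an operation log to
# recover the pieces of data changed during the first period.
import re

LOG_LINE = re.compile(r"^(?P<ts>\d+) (?P<op>INSERT|UPDATE|DELETE|SELECT) (?P<key>\S+)")

def changed_keys_from_log(log_text, period_start, period_end):
    """Return the set of keys changed within [period_start, period_end]."""
    changed = set()
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip malformed lines
        ts = int(m.group("ts"))
        # Keep only content-changing operations inside the first period.
        if period_start <= ts <= period_end and m.group("op") != "SELECT":
            changed.add(m.group("key"))
    return changed
```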
Alternatively, according to an embodiment of the present disclosure, the disk data change acquiring module 720 is specifically configured to: record, in real time, time information of the latest content change operation performed on each piece of data in the first disk area 101 (operation S501); and acquire, based on the time information, all data whose latest content change operation occurred within the first period, to obtain the one or more pieces of data (operation S502).
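The timestamp-based approach of operations S501 and S502 can be sketched as a per-piece last-modified record plus a range query over it. The bookkeeping shown here is an assumption for the example.

```python
# Illustrative sketch of operations S501-S502: keep a last-modified timestamp
# per piece of data, then select everything changed during the first period.
def record_change(last_modified, key, ts):
    """Record (in real time) the time of the latest content change of key."""
    last_modified[key] = ts  # later changes overwrite earlier timestamps

def keys_changed_in_period(last_modified, first_time, current_time):
    """Return all pieces whose latest change falls within the first period."""
    return {k for k, ts in last_modified.items()
            if first_time <= ts <= current_time}
```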
According to an embodiment of the present disclosure, the cache update module 740 is specifically configured to: determine, based on the cache information of the one or more pieces of data, the data partition or data block to which the one or more pieces of data belong in the memory 102 (operation S601), and re-cache, in its entirety, the data currently stored in the first disk region 101 that corresponds to that data partition or data block (operation S602).
Alternatively, according to an embodiment of the present disclosure, the cache update module 740 is specifically configured to re-cache each piece of the one or more pieces of data according to its corresponding cache information. According to an embodiment of the present disclosure, re-caching each piece of the one or more pieces of data according to the corresponding cache information includes: when the one or more pieces of data include content-update data for original data in the first disk area 101, replacing the original data cached in the memory 102 with the content-update data; when the one or more pieces of data include data newly added in the first disk area 101, adding a cache entry for the newly added data in the memory 102; and when the one or more pieces of data include original data that has been deleted from the first disk area 101, deleting the original data cached in the memory 102.
The data caching apparatus 700 according to the embodiment of the present disclosure can, on the one hand, reduce the amount of data processed during each caching operation, and on the other hand, shorten the interval between two adjacent caching operations, improving the real-time performance of cache data synchronization. For details, reference may be made to the descriptions of Fig. 2 to Fig. 6, which are not repeated here.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the correspondence obtaining module 710, the disk data change acquiring module 720, the cache change determining module 730, and the cache updating module 740 may be combined into one module, or any one of them may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the correspondence obtaining module 710, the disk data change acquiring module 720, the cache change determining module 730, and the cache updating module 740 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system in a package, or an Application Specific Integrated Circuit (ASIC); may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit; or may be implemented by software, hardware, firmware, or any suitable combination thereof. Alternatively, at least one of these modules may be at least partially implemented as a computer program module which, when executed, performs the corresponding function.
FIG. 8 schematically illustrates a block diagram of a computer system 800 suitable for implementing data caching, according to an embodiment of the present disclosure. The computer system 800 illustrated in FIG. 8 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in fig. 8, computer system 800 includes a processor 810, and a computer-readable storage medium 820. The computer system 800 may perform a data caching method according to an embodiment of the present disclosure.
In particular, processor 810 may include, for example, a general purpose microprocessor, an instruction set processor and/or related chip set and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), and/or the like. The processor 810 may also include on-board memory for caching purposes. Processor 810 may be a single processing unit or a plurality of processing units for performing different actions of a method flow according to embodiments of the disclosure.
Computer-readable storage medium 820, for example, may be a non-volatile computer-readable storage medium, specific examples including, but not limited to: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and so on.
The computer-readable storage medium 820 may include a computer program 821, which computer program 821 may include code/computer-executable instructions that, when executed by the processor 810, cause the processor 810 to perform a data caching method according to an embodiment of the present disclosure, or any variation thereof.
The computer program 821 may be configured with, for example, computer program code comprising computer program modules. For example, in an example embodiment, the code in computer program 821 may include one or more program modules, including, for example, module 821A, module 821B, and so on. It should be noted that the division and number of the modules are not fixed; those skilled in the art may use suitable program modules or combinations thereof according to the actual situation, and when these program modules are executed by the processor 810, the processor 810 may execute the data caching method according to the embodiment of the present disclosure or any variation thereof.
According to an embodiment of the present disclosure, at least one of the correspondence obtaining module 710, the disk data change obtaining module 720, the cache change determining module 730, and the cache updating module 740 may be implemented as a computer program module described with reference to Fig. 8, which, when executed by the processor 810, may implement the corresponding operations described above.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement a data caching method according to an embodiment of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or combinations of features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or combinations are not expressly recited in the present disclosure. In particular, various combinations and/or combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.
Claims (5)
1. A data caching method, comprising:
obtaining a storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, wherein the first cache data is data cached from the first disk area to the memory at a first time, the obtaining comprising:
establishing a storage correspondence of each piece of data from the first disk area to the memory according to original storage information of each piece of data in the first disk area and its cache information in the memory;
acquiring one or more pieces of data whose data content in the first disk area changed during a first period, wherein the first period extends from the first time to the current time;
determining cache information of the one or more pieces of data in the memory based on the storage correspondence; and
re-caching the one or more pieces of data based on the cache information of the one or more pieces of data to update the first cache data, comprising:
determining a data partition or a data block to which the one or more pieces of data belong in the memory based on the cache information of the one or more pieces of data; and
re-caching, in its entirety, the data currently stored in the first disk area that corresponds to the data partition or the data block.
2. The data caching method of claim 1, wherein acquiring one or more pieces of data whose data content in the first disk area changed during a first period comprises:
monitoring data query operations executed on the first disk area during the first period; and
acquiring the one or more pieces of data based on the data query operations.
3. The data caching method of claim 1, wherein acquiring one or more pieces of data whose data content in the first disk area changed during a first period comprises:
acquiring a log recording the operations executed on the first disk area during the first period;
parsing the log to obtain operation information indicating changes to the data content; and
acquiring the one or more pieces of data based on the operation information.
4. The data caching method of claim 1, wherein acquiring one or more pieces of data whose data content in the first disk area changed during a first period comprises:
recording, in real time, time information of the latest content change operation executed on each piece of data in the first disk area; and
acquiring, based on the time information, all data whose latest content change operation occurred within the first period, to obtain the one or more pieces of data.
5. A data caching apparatus, comprising:
a correspondence obtaining module, configured to obtain a storage correspondence of each piece of data in first cache data cached from a first disk area to a memory, wherein the first cache data is data cached from the first disk area to the memory at a first time, the obtaining comprising:
establishing a storage correspondence of each piece of data from the first disk area to the memory according to original storage information of each piece of data in the first disk area and its cache information in the memory;
a disk data change acquiring module, configured to acquire one or more pieces of data whose data content in the first disk area changed during a first period, wherein the first period extends from the first time to the current time;
a cache change determining module, configured to determine cache information of the one or more pieces of data in the memory based on the storage correspondence; and
a cache updating module, configured to re-cache the one or more pieces of data based on the cache information of the one or more pieces of data to update the first cache data, the re-caching comprising:
determining a data partition or a data block to which the one or more pieces of data belong in the memory based on the cache information of the one or more pieces of data; and
re-caching, in its entirety, the data currently stored in the first disk area that corresponds to the data partition or the data block.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910226551.2A CN109981774B (en) | 2019-03-22 | 2019-03-22 | Data caching method and data caching device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109981774A CN109981774A (en) | 2019-07-05 |
CN109981774B true CN109981774B (en) | 2021-02-19 |
Family
ID=67080320
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910226551.2A Active CN109981774B (en) | 2019-03-22 | 2019-03-22 | Data caching method and data caching device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109981774B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111611287B (en) * | 2020-06-17 | 2023-10-03 | 北京商越网络科技有限公司 | Cache data updating method |
CN113626458A (en) * | 2021-08-19 | 2021-11-09 | 咪咕数字传媒有限公司 | High-concurrency data updating method, device, equipment and computer storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103853727A (en) * | 2012-11-29 | 2014-06-11 | 深圳中兴力维技术有限公司 | Method and system for improving large data volume query performance |
CN105701190A (en) * | 2016-01-07 | 2016-06-22 | 深圳市金证科技股份有限公司 | Data synchronizing method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105243109B (en) * | 2015-09-25 | 2021-10-15 | 华为技术有限公司 | Data backup method and data processing system |
2019-03-22: CN 201910226551.2A patent / CN109981774B (en), Active
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||