CN114780043A - Data processing method and device based on multilayer cache and electronic equipment - Google Patents

Data processing method and device based on multilayer cache and electronic equipment

Info

Publication number
CN114780043A
Authority
CN
China
Prior art keywords
cache
data
layer
identification code
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210497433.7A
Other languages
Chinese (zh)
Inventor
贺素馨
池信泽
张旭明
王豪迈
胥昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xingchen Tianhe Technology Co ltd
Original Assignee
Beijing Xingchen Tianhe Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xingchen Tianhe Technology Co ltd filed Critical Beijing Xingchen Tianhe Technology Co ltd
Priority to CN202210497433.7A
Publication of CN114780043A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 - Improving I/O performance
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24552 - Database cache management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0628 - Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 - Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 - Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0668 - Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 - Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device based on a multi-layer cache, and an electronic device. The method comprises the following steps: receiving a data write request sent by a client, wherein the data write request carries at least target data to be written; in response to the data write request, generating a cache object corresponding to the target data, as well as a data object and a unique identification code corresponding to the cache object, wherein the cache object and the data object each correspond to object metadata; migrating the cache object into a cache layer, and establishing a first mapping relationship between the object metadata of the cache object and the unique identification code; and storing the data object into a data layer, and establishing a second mapping relationship between the object metadata of the data object and the unique identification code. The invention solves the technical problem in the related art that transactional processing is complex when a tiered cache architecture is used for data caching.

Description

Data processing method and device based on multilayer cache and electronic equipment
Technical Field
The invention relates to the field of data processing technologies, and in particular to a data processing method and device based on a multi-layer cache, and an electronic device.
Background
In the related art, the snapshot has become an indispensable and common capability in storage and application-level disaster recovery schemes; it is the basis for replication and backup in other systems and has become a standard feature of storage products. Common snapshot implementations include COW (Copy-On-Write) and ROW (Redirect-On-Write). ROW is implemented through a snapshot Rename: each time a snapshot is generated, new metadata is created and a new mapping relationship is established. This is easy to implement with a local cache: the metadata of all objects is recorded in the cache, and the cache and the data disk are usually on the same node (that is, in a centralized cache, the cache and the data disk share the same redundancy and are bound to each other). When the object metadata in the high-speed medium is updated during the snapshot Rename, the generated snapshot must point to the source data object, which triggers a series of data updates.
Compared with centralized caching, distributed multi-layer caching has developed rapidly in recent years. It completely removes the constraint that the high-speed cache media and the low-speed data disks must be bound together, and provides more stable cache performance. The cache can be distributed over multiple independent servers, which may be shared with the data layer, and the cache can use a different redundancy strategy; for example, the cache servers may use 3 replicas while the data servers use 2 replicas. Tiered caching can therefore better meet the needs of large clusters and large data centers in distributed storage.
However, the distributed tiered cache architecture in the related art has the following problems:
1. During a snapshot rename, the rename command must be sent to multiple layers (the cache layer and the data layer) simultaneously, and a very complex procedure is required to guarantee the transactionality of the rename across the cache layer and the data layer. Moreover, the data layer is generally built from low-speed disks, on which renames are slow, so the overall rename latency is very high.
2. The deletion procedure, similar to rename, can hardly meet the performance and transactionality requirements.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the present invention provide a data processing method and device based on a multi-layer cache, and an electronic device, which at least solve the technical problem in the related art that transactional processing is complex when a tiered cache architecture is used for data caching.
According to an aspect of the embodiments of the present invention, a data processing method based on a multi-layer cache is provided, applied to a placement group PG, where the placement group PG interfaces with a cache layer and a data layer respectively. The method includes: receiving a data write request sent by a client, wherein the data write request carries at least target data to be written; in response to the data write request, generating a cache object corresponding to the target data, as well as a data object and a unique identification code corresponding to the cache object, wherein the cache object and the data object each correspond to object metadata; migrating the cache object into the cache layer, and establishing a first mapping relationship between the object metadata of the cache object and the unique identification code; and storing the data object into the data layer, and establishing a second mapping relationship between the object metadata of the data object and the unique identification code.
Optionally, after establishing the first mapping relationship between the object metadata of the cache object and the unique identification code, the method further includes: when the object metadata in the cache layer is updated, adjusting the first mapping relationship between the object metadata in the cache layer and the unique identification code.
Optionally, after establishing the second mapping relationship between the data object and the unique identification code, the method further includes: when the cache layer triggers a preset flush-down policy, flushing the cache object from the cache layer down to the data object of the data layer; deleting the cache object in the cache layer while retaining the object metadata of the cache object; and pointing the object metadata of the cache object in the cache layer to the data object in the data layer through the unique identification code.
Optionally, after the object metadata of the cache object in the cache layer is pointed to the data object in the data layer through the unique identification code, the method further includes: releasing the deleted cache object for use as a read cache when it is determined that the cache space in the cache layer meets the free-space requirement; and releasing the cache space corresponding to the cache object when it is determined that the cache space does not meet the free-space requirement.
Optionally, after releasing the cache space corresponding to the cache object, the method further includes: when a data read request is received, searching for the object metadata of the data object through the object metadata of the cache object; querying the data object based on the found object metadata of the data object; and caching the data object into the cache layer.
Optionally, after generating the cache object corresponding to the target data, and the data object and the unique identification code corresponding to the cache object, the method further includes: storing the cache object to the cache layer and the data layer respectively; when an object deletion instruction is received, deleting the cache object of the cache layer and updating the object metadata of the cache object; receiving a reclaim-and-flush-down instruction initiated by a background thread of the cache layer; in response to the reclaim-and-flush-down instruction, sending an object deletion instruction to the data layer to delete the cache object in the data layer and update the object metadata of the cache object in the data layer; and releasing the cache space of the cache object in the cache layer and the cache space of the cache object in the data layer.
Optionally, the step of generating, in response to the data write request, the cache object corresponding to the target data, as well as the data object and the unique identification code corresponding to the cache object, includes: creating, in response to the data write request, the cache object corresponding to the target data; and allocating the unique identification code to the cache object by using an identifier allocator.
According to another aspect of the embodiments of the present invention, a data processing apparatus based on a multi-layer cache is provided, applied to a placement group PG, where the placement group PG interfaces with a cache layer and a data layer respectively. The apparatus includes: a receiving unit, configured to receive a data write request sent by a client, where the data write request carries at least target data to be written; a response unit, configured to generate, in response to the data write request, a cache object corresponding to the target data, as well as a data object and a unique identification code corresponding to the cache object, where the cache object and the data object each correspond to object metadata; a first establishing unit, configured to migrate the cache object into the cache layer and establish a first mapping relationship between the object metadata of the cache object and the unique identification code; and a second establishing unit, configured to store the data object into the data layer and establish a second mapping relationship between the object metadata of the data object and the unique identification code.
Optionally, the data processing apparatus based on multi-layer cache further includes: and the first adjusting unit is used for adjusting the first mapping relation between the object metadata in the cache layer and the unique identification code when the object metadata in the cache layer is updated after the first mapping relation between the object metadata of the cache object and the unique identification code is established.
Optionally, the data processing apparatus based on a multi-layer cache further includes: a flush-down unit, configured to flush the cache object from the cache layer down to the data object of the data layer when the cache layer triggers a preset flush-down policy after the second mapping relationship between the data object and the unique identification code is established; a first deletion unit, configured to delete the cache object in the cache layer while retaining the object metadata of the cache object; and a pointing unit, configured to point the object metadata of the cache object in the cache layer to the data object in the data layer through the unique identification code.
Optionally, the data processing apparatus based on a multi-layer cache further includes: a first release unit, configured to release the deleted cache object for use as a read cache when it is determined that the cache space in the cache layer meets the free-space requirement, after the object metadata of the cache object in the cache layer points to the data object in the data layer through the unique identification code; and a second release unit, configured to release the cache space corresponding to the cache object when the cache space does not meet the free-space requirement.
Optionally, the data processing apparatus based on a multi-layer cache further includes: the first searching unit is used for searching the object metadata of the data object through the object metadata of the cache object under the condition of receiving a data reading request after the cache space corresponding to the cache object is released; a second searching unit, configured to query the data object based on the searched object metadata of the data object; and the cache unit is used for caching the data object into the cache layer.
Optionally, the data processing apparatus based on a multi-layer cache further includes: a first storage unit, configured to store the cache object to the cache layer and the data layer after the cache object corresponding to the target data, and the data object and the unique identification code corresponding to the cache object, are generated; a second deletion unit, configured to delete the cache object of the cache layer and update the object metadata of the cache object when an object deletion instruction is received; a first receiving module, configured to receive a reclaim-and-flush-down instruction initiated by a background thread of the cache layer; a first response module, configured to send, in response to the reclaim-and-flush-down instruction, an object deletion instruction to the data layer so as to delete the cache object in the data layer and update the object metadata of the cache object in the data layer; and a first release module, configured to release the cache space of the cache object in the cache layer and the cache space of the cache object in the data layer.
Optionally, the response unit includes: a second response module, configured to create, in response to the data write request, the cache object corresponding to the target data; and an allocation module, configured to allocate the unique identification code to the cache object by using an identifier allocator.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute any one of the above-mentioned multi-level cache-based data processing methods via execution of the executable instructions.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is further provided, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, a device where the computer-readable storage medium is located is controlled to execute any one of the above-mentioned data processing methods based on multi-layer cache.
According to the method and apparatus, after a data write request sent by a client is received, a cache object corresponding to the target data, as well as a data object and a unique identification code corresponding to the cache object, are generated in response to the request, where the cache object and the data object each correspond to object metadata; the cache object is migrated into the cache layer and a first mapping relationship is established between the object metadata of the cache object and the unique identification code; and the data object is stored into the data layer and a second mapping relationship is established between the object metadata of the data object and the unique identification code. In this embodiment, the unique identification code associates the object mappings of the data layer and the cache layer, which reduces the complexity of transactional processing, guarantees the correspondence between the cache object and the data object, and improves data query efficiency and cache efficiency, thereby solving the technical problem in the related art that transactional processing is complex when a tiered cache architecture is used for data caching.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention and do not constitute a limitation of the invention. In the drawings:
FIG. 1 is a flow chart of an alternative multi-level cache based data processing method according to an embodiment of the present invention;
FIG. 2 is a diagram of an optional hierarchically cached object mapping relationship, according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of data flush-down in an optional hierarchical cache according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative multi-level cache based data processing apparatus according to an embodiment of the present invention;
FIG. 5 is a block diagram of the hardware structure of an electronic device (or mobile device) for the multi-layer cache based data processing method according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
To facilitate understanding of the present invention by those skilled in the art, the following explanation is made for some terms or nouns appearing in the embodiments of the present invention:
the Placement Group, PG for short, implements policy isolation between different storage pools, and is used to organize and map the object storage.
Object, minimal data storage unit for cluster management.
The Data layer, also called Data Pool, is a Data layer in a storage hierarchy, and is composed of relatively low-speed storage media such as a mechanical disk HDD and a common solid state disk SSD of each node.
The Cache layer is also called as Cache OSD, Cache, and the Cache layer in the storage hierarchical structure is composed of high-speed storage media such as SSD, NVMe of each node and the like.
A data volume, volume for short, and a logical storage space;
the bucket, buckut, may also be called a cache bucket.
The invention can be applied to various data storage systems/cache devices that comprise a client (Client), a cache layer (Cache Pool), and a data layer (Data Pool). Through such a data storage system/cache device, data snapshot backup, data flush-down processing, data eviction, and the like can be realized. Data snapshot processing is mainly implemented in the ROW manner: each snapshot is generated through a snapshot Rename, which creates new metadata and establishes a new mapping relationship. When the metadata of an object in the high-speed medium is updated during the snapshot Rename, the generated snapshot must point to the source data object, which triggers a series of data updates.
The invention introduces an object mapping method and processing strategy between multiple cache layers, covering the establishment and management of the mapping relationship, the uniqueness of the mapping, and the use of the mapping for data flush-down and space reclamation. It can solve the problem of snapshot rename and other metadata updates in a tiered structure: while uniqueness is guaranteed, the implementation is simpler and the transaction processing latency is significantly reduced.
The method also provides support for asynchronous background data processing across multiple layers, avoiding the performance degradation caused by synchronous processing, and can effectively solve data-consistency problems that may arise from various abnormal switchovers during multi-layer coordination.
The following embodiments of the present invention implement an object mapping, allocation, and deletion mechanism for a tiered cache. Embodiments of the present invention are described below.
Example one
In accordance with an embodiment of the present invention, there is provided a multi-level cache based data processing method embodiment, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than that described herein.
The embodiment of the invention provides a data processing method based on a multi-layer cache, applied to a placement group PG, where the placement group PG interfaces with a cache layer and a data layer respectively. The PG in this embodiment is responsible for receiving and processing requests from clients; downward, it is responsible for translating these data requests into transactions that the local object store can understand.
Fig. 1 is a flowchart of an alternative data processing method based on a multi-level cache according to an embodiment of the present invention, as shown in fig. 1, the method includes the following steps:
step S102, receiving a data writing request sent by a client, wherein the data writing request at least carries: target data to be written.
In the embodiment of the present invention, the types of the client include, but are not limited to: PC, notebook, mobile terminal, tablet, etc. The client may refer to a service layer or a request-initiating layer; when data needs to be stored or fetched, it sends a request to the cache layer.
And step S104, responding to the data writing request, and generating a cache object corresponding to the target data, a data object corresponding to the cache object and a unique identification code, wherein the cache object and the data object respectively correspond to object metadata.
Optionally, the step of generating, in response to the data write request, the cache object corresponding to the target data, as well as the data object and the unique identification code corresponding to the cache object, includes: creating, in response to the data write request, a cache object corresponding to the target data; and allocating a unique identification code to the cache object by using an identifier allocator.
The unique identification code in this embodiment may be a UUID, whose number of bits can be set according to the device. When data IO is written for the first time, a cache object A is newly generated, and at the same time one data object A is generated and can be persistently stored. In this embodiment, each time an object is generated, a UUID (universally unique identifier) is applied for, guaranteeing the natural uniqueness of A; the namespace of the UUID is the placement group PG, and the UUID corresponds to the object in the data layer.
FIG. 2 is an object mapping relationship diagram of an optional hierarchical cache according to an embodiment of the present invention. As shown in FIG. 2, the diagram includes: the store's cache tier (pointing to A), a 64-bit UUID mapping, and the store's data tier (pointing to A).
The object mapping relationships in the data layer and the cache layer are associated through the UUID, realizing a transactional processing manner. When a new object is added, the identifier-allocator branch is entered; allocation of UUIDs uses the PG as the namespace, which reduces the probability of errors.
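The identifier-allocator branch described above can be sketched as follows. This is an illustrative assumption, not the patent's literal implementation: the patent only specifies that UUID allocation uses the placement group PG as its namespace, so the class and method names below are hypothetical.

```python
import uuid

# Hypothetical sketch of a per-PG identifier allocator. The only
# property taken from the text is that the UUID namespace is the
# placement group PG; everything else is illustrative.
class PGIdAllocator:
    def __init__(self, pg_id: str):
        # Derive a namespace UUID from the PG identifier, so each PG
        # hands out identifiers inside its own namespace.
        self.namespace = uuid.uuid5(uuid.NAMESPACE_OID, pg_id)
        self._counter = 0

    def allocate(self) -> uuid.UUID:
        # Every newly created object gets a fresh identifier that is
        # naturally unique within this PG.
        self._counter += 1
        return uuid.uuid5(self.namespace, str(self._counter))

alloc = PGIdAllocator("pg.1")
uid_a, uid_b = alloc.allocate(), alloc.allocate()
assert uid_a != uid_b  # identifiers within one PG never repeat
```

Scoping the namespace to the PG means two PGs can allocate independently without coordination, which is consistent with the reduced error probability mentioned above.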
Optionally, in this embodiment, each snapshot is generated by creating new metadata and establishing a new mapping relationship. When the object metadata in the high-speed medium is updated during the snapshot Rename, the generated snapshot must point to the source data object, and a series of data updates will be performed.
Step S106, the cache object is migrated into the cache layer, and a first mapping relation between the object metadata of the cache object and the unique identification code is established.
And step S108, storing the data object into the data layer, and establishing a second mapping relation between the object metadata of the data object and the unique identification code.
Through the above steps, after a data write request sent by a client is received, a cache object corresponding to the target data, as well as a data object and a unique identification code corresponding to the cache object, are generated in response to the request, where the cache object and the data object each correspond to object metadata; the cache object is migrated into the cache layer and a first mapping relationship is established between its object metadata and the unique identification code; and the data object is stored into the data layer and a second mapping relationship is established between its object metadata and the unique identification code. In this embodiment, the unique identification code associates the object mappings of the data layer and the cache layer, reducing the complexity of transactional processing, guaranteeing the correspondence between the cache object and the data object, and improving data query and cache efficiency, thereby solving the technical problem in the related art that transactional processing is complex when a tiered cache architecture is used for data caching.
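The write path of steps S102 to S108 can be modeled as a small sketch. All class and field names are illustrative assumptions; the point is that one unique identification code links the cache-side and data-side object metadata.

```python
# Minimal illustrative model of the write path (steps S102-S108).
# Names are assumptions for illustration, not from the patent.
class MultiLayerCache:
    def __init__(self):
        self.cache_layer = {}  # cache objects, keyed by object name
        self.data_layer = {}   # data objects, keyed by object name
        self.first_map = {}    # first mapping: cache metadata -> unique id
        self.second_map = {}   # second mapping: data metadata -> unique id

    def write(self, name, target_data, uid):
        # S104: generate the cache object and the data object
        # corresponding to it; each carries its own object metadata.
        cache_obj = {"meta": f"cache:{name}", "data": target_data}
        data_obj = {"meta": f"data:{name}", "data": target_data}
        # S106: migrate the cache object into the cache layer and
        # record the first mapping (cache metadata -> uid).
        self.cache_layer[name] = cache_obj
        self.first_map[cache_obj["meta"]] = uid
        # S108: store the data object into the data layer and
        # record the second mapping (data metadata -> uid).
        self.data_layer[name] = data_obj
        self.second_map[data_obj["meta"]] = uid

mlc = MultiLayerCache()
mlc.write("A", b"payload", "uuid-1")
```

After the write, both mappings resolve to the same identifier, which is what later lets flush-down and lookups cross layers without a multi-layer rename transaction.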
The following describes embodiments of the present invention in detail with reference to the above-described respective implementation steps.
As an optional implementation manner of this embodiment, after establishing the first mapping relationship between the object metadata of the cache object and the unique identification code, the method further includes: and under the condition that the object metadata in the cache layer is updated, adjusting a first mapping relation between the object metadata in the cache layer and the unique identification code.
When the object metadata of the cache layer is updated, for example when a snapshot policy is triggered to produce a ROW snapshot, the object metadata in the cache layer is modified; at this time, the mapping relationship between the object metadata and the unique identification code must be modified synchronously to guarantee the correspondence between the metadata object and the object in the data layer.
In this embodiment, after the object metadata in the cache layer is updated, the mapping relationship between the object metadata and the unique identification code is modified accordingly, which guarantees the correspondence between the cache object and the data object: when the object metadata cannot be obtained from the cache, it can be looked up quickly and accurately in the data layer, improving efficiency, and the association is preserved after hot data is promoted.
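The synchronous adjustment of the first mapping can be sketched as a simple re-keying operation. The function name and the metadata-key strings below are hypothetical; the patent does not prescribe a concrete data structure.

```python
def adjust_first_mapping(cache_meta_map, old_meta_key, new_meta_key):
    """Illustrative sketch: when cache-layer object metadata changes
    (e.g. a ROW snapshot rewrites it), re-key the metadata->UUID mapping
    so the unique identification code still resolves the same data-layer
    object."""
    uid = cache_meta_map.pop(old_meta_key)  # drop the stale association
    cache_meta_map[new_meta_key] = uid      # re-associate under new metadata
    return uid
```

For example, if metadata entry `"objA@v1"` is rewritten by a snapshot to `"objA@v2"`, re-keying keeps the same UUID attached, so the data-layer lookup path is unchanged.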
Optionally, after the second mapping relationship between the data object and the unique identification code is established, the method further includes: in the case that the cache layer triggers a preset flush policy, flushing the cache object from the cache layer to the data object of the data layer; deleting the cache object in the cache layer while keeping the object metadata of the cache object; and pointing the object metadata of the cache object in the cache layer to the data object in the data layer through the unique identification code.
The cache layer caches the object metadata of all cache objects (covering all data objects). When the cache layer triggers a certain policy, such as a watermark policy or an idle-flush policy, the cache objects are flushed down to the data layer; the cache layer then deletes the cache objects but retains their object metadata.
Fig. 3 is a schematic diagram of data flushing in an optional hierarchical cache according to an embodiment of the present invention. As shown in fig. 3, when the cache layer triggers a certain policy, such as a watermark policy or an idle-flush policy, cache object A (Object A) is flushed from the cache layer to data object a (object a) in the data layer; cache object A in the cache layer is deleted, but object metadata A is retained; object metadata A in the cache layer points to data object a in the data layer through the UUID.
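The flush-and-demote flow of fig. 3 can be sketched as below. The function name and the dictionary layout are illustrative assumptions; only the sequence — flush down, delete the cache copy, keep metadata pointing at the data layer via the shared UUID — comes from the text.

```python
def flush_and_demote(cache_objects, cache_meta, data_objects, uid):
    """Illustrative sketch of a watermark/idle-flush: push cache object A
    down to the data layer, delete it from the cache layer, and keep its
    object metadata pointing at data object a via the shared UUID."""
    obj = cache_objects.pop(uid)          # delete cache object A from the cache layer
    data_objects[uid] = obj               # the flushed copy becomes data object a
    cache_meta[uid] = {"points_to": "data", "uuid": uid}  # metadata A retained
```

After this step the cache layer holds only metadata for the object, so cache space is freed while reads can still be resolved through the UUID.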
Optionally, after the object metadata of the cache object in the cache layer is pointed to the data object in the data layer through the unique identification code, the method further includes: releasing the deleted cache object as a read cache in the case that it is determined that the cache space in the cache layer meets the free-space requirement; and releasing the cache space corresponding to the cache object in the case that the cache space does not meet the free-space requirement.
In this embodiment, after the cache space corresponding to the cache object is released, the method further includes: in the case that a data read request is received, searching for the object metadata of the data object through the object metadata of the cache object; querying the data object based on the found object metadata of the data object; and caching the data object into the cache layer.
Referring to fig. 3, the deleted object A is released as a read cache, so an incoming data read can still be served from object A. When cache space becomes insufficient, the cache space of less-hot objects is released. After such a release, if the data needs to be read again, object metadata a must first be found through object metadata A; data object a is then located, promoted to the cache layer, and accessed by the front-end application.
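The read path after a flush can be sketched as follows. The function signature and dictionary structures are hypothetical; the logic — serve from the cache if resident, otherwise follow metadata A through the UUID to data object a and promote it — follows the flow just described.

```python
def read(uid, cache_objects, cache_meta, data_objects):
    """Illustrative read-path sketch after a flush: hit the read cache if
    the object is still resident; otherwise follow object metadata A via
    the shared UUID to data object a, and promote it back into the cache
    layer for the front-end application."""
    if uid in cache_objects:              # object still released as read cache
        return cache_objects[uid]
    target = cache_meta[uid]["uuid"]      # metadata A -> shared UUID
    obj = data_objects[target]            # -> metadata a -> data object a
    cache_objects[uid] = obj              # promote back into the cache layer
    return obj
```

Because the UUID was preserved in metadata A, the miss path is a direct lookup rather than a scan of the data layer.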
Optionally, after generating the cache object corresponding to the target data, together with the data object and the unique identification code corresponding to the cache object, the method further includes: storing the cache object to the cache layer and the data layer respectively; in the case that an object deletion instruction is received, deleting the cache object of the cache layer and updating the object metadata of the cache object; receiving a reclaim-flush instruction initiated by a background thread of the cache layer; in response to the reclaim-flush instruction, sending an object deletion instruction to the data layer to delete the cache object in the data layer and update the object metadata of the cache object in the data layer; and releasing the cache space of the cache object in the cache layer and the cache space of the cache object in the data layer.
The number of objects in the data layer determines the available space of the whole pool; the cache layer is only an intermediate stage. Under some special conditions, a data object may exist in both the cache layer and the data layer; in that case, during deletion of the data object, the objects in both the cache layer and the data layer need to be cleared at the same time.
The cache layer caches all object metadata. After a cache-object deletion command is received, the cache object in the cache layer is deleted (if it exists) and the object metadata (Meta data) is updated, completing the deletion operation in the cache layer. The cache OSD background thread then initiates a reclaim-flush operation (if needed) and sends a deletion command to the data-layer OSD; after the data-layer OSD returns, the corresponding object is deleted and its Meta data is updated, completing the operation on the data layer, at which point all of the space is truly released.
In this embodiment, the above policy requires that the association between cache object A and its data object a be globally unique; if A is deleted and created again, a new corresponding a should be generated.
In this embodiment, under abnormal conditions a data object may exist in the cache layer and the data layer at the same time; the deletion flow for this case must clear the object in both layers simultaneously. The reclaim mechanism can then recover the cache space properly, avoiding wasted space occupation.
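The two-layer deletion and reclaim flow can be sketched as below. The function name and parameters are illustrative; the essential point from the text is that both layers' copies and metadata are cleared so that no space remains occupied.

```python
def delete_everywhere(uid, cache_objects, cache_meta, data_objects, data_meta):
    """Illustrative sketch of the abnormal-case cleanup: the object may
    live in both layers, so the delete clears the cache-layer copy first,
    then (via the background reclaim-flush) the data-layer copy, releasing
    all space."""
    cache_objects.pop(uid, None)   # delete the cache-layer copy if present
    cache_meta.pop(uid, None)      # update (here: drop) cache-layer metadata
    # background reclaim-flush forwards the delete to the data layer
    data_objects.pop(uid, None)
    data_meta.pop(uid, None)       # space in both layers is now released
```

Using `pop(uid, None)` makes the cleanup idempotent, which matters when the object happens to exist in only one of the two layers.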
The invention is described below in connection with an alternative embodiment.
Example two
An embodiment of the present invention provides a data processing apparatus based on multilayer cache, applied to a placement group (PG), where the PG interfaces with a cache layer and a data layer respectively. The multilayer-cache-based data processing apparatus includes a plurality of implementation units corresponding to the implementation steps in the first embodiment.
Fig. 4 is a schematic diagram of an optional data processing apparatus based on multilayer cache according to an embodiment of the present invention. As shown in fig. 4, the apparatus includes: a receiving unit 41, a response unit 43, a first establishing unit 45, and a second establishing unit 47, wherein,
a receiving unit 41, configured to receive a data write request sent by a client, where the data write request at least carries: target data to be written;
a response unit 43, configured to generate, in response to the data write request, a cache object corresponding to the target data, and a data object and a unique identification code corresponding to the cache object, where the cache object and the data object respectively correspond to object metadata;
the first establishing unit 45 is configured to migrate the cache object into the cache layer, and establish a first mapping relationship between object metadata of the cache object and the unique identification code;
the second establishing unit 47 is configured to store the data object into the data layer, and establish a second mapping relationship between the object metadata of the data object and the unique identifier.
After the receiving unit 41 receives a data write request sent by a client, the response unit 43 may respond to the request by generating a cache object corresponding to the target data, together with a data object and a unique identification code corresponding to the cache object, where the cache object and the data object each correspond to object metadata. The first establishing unit 45 may migrate the cache object into the cache layer and establish a first mapping relationship between the object metadata of the cache object and the unique identification code, and the second establishing unit 47 may store the data object into the data layer and establish a second mapping relationship between the object metadata of the data object and the unique identification code. In this embodiment, the unique identification code is used to associate the object mappings of the data layer and the cache layer, which reduces the complexity of transactional processing while guaranteeing the correspondence between the cache object and the data object, improving data query efficiency and cache efficiency, and thereby solving the technical problem in the related art that transactional processing is complex when a layered cache architecture is used for data caching.
Optionally, the multilayer-cache-based data processing apparatus further includes: a first adjusting unit, configured to, after the first mapping relationship between the object metadata of the cache object and the unique identification code is established, adjust the first mapping relationship between the object metadata in the cache layer and the unique identification code when the object metadata in the cache layer is updated.
Optionally, the multilayer-cache-based data processing apparatus further includes: a flushing unit, configured to, after the second mapping relationship between the data object and the unique identification code is established, flush the cache object from the cache layer to the data object of the data layer in the case that the cache layer triggers a preset flush policy; a first deleting unit, configured to delete the cache object in the cache layer while keeping the object metadata of the cache object; and a pointing unit, configured to point the object metadata of the cache object in the cache layer to the data object in the data layer through the unique identification code.
Optionally, the multilayer-cache-based data processing apparatus further includes: a first releasing unit, configured to, after the object metadata of the cache object in the cache layer is pointed to the data object in the data layer through the unique identification code, release the deleted cache object as a read cache in the case that it is determined that the cache space in the cache layer meets the free-space requirement; and a second releasing unit, configured to release the cache space corresponding to the cache object in the case that the cache space does not meet the free-space requirement.
Optionally, the multilayer-cache-based data processing apparatus further includes: a first search unit, configured to, after the cache space corresponding to the cache object is released, search for the object metadata of the data object through the object metadata of the cache object in the case that a data read request is received; a second search unit, configured to query the data object based on the found object metadata of the data object; and a cache unit, configured to cache the data object into the cache layer.
Optionally, the multilayer-cache-based data processing apparatus further includes: a first storage unit, configured to, after the cache object corresponding to the target data and the data object and the unique identification code corresponding to the cache object are generated, store the cache object to the cache layer and the data layer respectively; a second deleting unit, configured to delete the cache object of the cache layer and update the object metadata of the cache object in the case that an object deletion instruction is received; a first receiving module, configured to receive a reclaim-flush instruction initiated by a background thread of the cache layer; a first response module, configured to, in response to the reclaim-flush instruction, send an object deletion instruction to the data layer so as to delete the cache object in the data layer and update the object metadata of the cache object in the data layer; and a first releasing module, configured to release the cache space of the cache object in the cache layer and the cache space of the cache object in the data layer.
Optionally, the response unit includes: the second response module is used for responding to the data writing request and creating a cache object corresponding to the target data; and the allocation module is used for allocating the unique identification code for the cache object by adopting the identification allocator.
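The identification allocator of the allocation module can be sketched minimally as below. The class name `IdAllocator` is an assumption; any generator of globally unique codes would fit the role described.

```python
import uuid


class IdAllocator:
    """Illustrative sketch of the identification allocator: each newly
    created cache object receives a globally unique code, so a
    deleted-then-recreated object never reuses an old association."""

    def allocate(self):
        # uuid4 gives a random 128-bit identifier; collisions are
        # negligible, satisfying the uniqueness the embodiment requires
        return str(uuid.uuid4())
```

This also matches the earlier requirement that if cache object A is deleted and created again, a fresh code (and thus a fresh data-object association) is generated.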
The data processing apparatus based on multi-layer cache may further include a processor and a memory, where the receiving unit 41, the responding unit 43, the first establishing unit 45, the second establishing unit 47, and the like are all stored in the memory as program units, and the processor executes the program units stored in the memory to implement corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, the kernel responds to the data write request, generates a cache object corresponding to the target data together with a data object and a unique identification code corresponding to the cache object, migrates the cache object into the cache layer, establishes a first mapping relationship between the object metadata of the cache object and the unique identification code, stores the data object into the data layer, and establishes a second mapping relationship between the object metadata of the data object and the unique identification code.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including: a processor; and a memory for storing executable instructions for the processor; wherein the processor is configured to perform any one of the above-described multi-level cache based data processing methods via execution of executable instructions.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute any one of the above-mentioned data processing methods based on multi-level cache.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program that initializes the following method steps: receiving a data write request sent by a client, where the data write request at least carries: target data to be written; responding to the data write request by generating a cache object corresponding to the target data, together with a data object and a unique identification code corresponding to the cache object, where the cache object and the data object each correspond to object metadata; migrating the cache object into the cache layer and establishing a first mapping relationship between the object metadata of the cache object and the unique identification code; and storing the data object into the data layer and establishing a second mapping relationship between the object metadata of the data object and the unique identification code.
Fig. 5 is a block diagram of the hardware structure of an electronic device (or mobile device) for the multilayer-cache-based data processing method according to an embodiment of the present invention. As shown in fig. 5, the electronic device may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. In addition, the electronic device may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a keyboard, a power supply, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 5 is only an illustration and does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components than shown in fig. 5, or have a different configuration than shown in fig. 5.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described in detail in a certain embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other manners. The above-described apparatus embodiments are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing beyond the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and amendments can be made without departing from the principle of the present invention, and these modifications and amendments should also be considered as the protection scope of the present invention.

Claims (10)

1. A data processing method based on multilayer cache, applied to a placement group (PG), wherein the PG interfaces with a cache layer and a data layer respectively, the method comprising:
receiving a data writing request sent by a client, wherein the data writing request at least carries: target data to be written;
responding to the data writing request, and generating a cache object corresponding to the target data, and a data object and a unique identification code corresponding to the cache object, wherein the cache object and the data object respectively correspond to object metadata;
migrating the cache object into the cache layer, and establishing a first mapping relation between object metadata of the cache object and the unique identification code;
and storing the data object into a data layer, and establishing a second mapping relation between the object metadata of the data object and the unique identification code.
2. The data processing method according to claim 1, further comprising, after establishing the first mapping relationship between the object metadata of the cache object and the unique identification code:
and under the condition that the object metadata in the cache layer is updated, adjusting a first mapping relation between the object metadata in the cache layer and the unique identification code.
3. The data processing method of claim 1, further comprising, after establishing the second mapping relationship between the data object and the unique identification code:
under the condition that the cache layer triggers a preset flushing strategy, the cache object is flushed from the cache layer to the data object of the data layer;
deleting the cache objects in the cache layer and keeping object metadata of the cache objects;
and pointing the object metadata of the cache objects in the cache layer to the data objects in the data layer through the unique identification code.
4. The data processing method of claim 3, further comprising, after pointing object metadata of the cache object in the cache layer to the data object in the data layer via the unique identification code:
releasing the deleted cache object as a read cache in the case that it is determined that the cache space in the cache layer meets the free-space requirement;
and releasing the cache space corresponding to the cache object in the case that it is determined that the cache space does not meet the free-space requirement.
5. The data processing method according to claim 4, further comprising, after releasing the cache space corresponding to the cache object:
under the condition of receiving a data reading request, searching object metadata of the data object through the object metadata of the cache object;
querying the data object based on the found object metadata of the data object;
and caching the data object into the cache layer.
6. The data processing method of claim 1, further comprising, after generating a cache object corresponding to the target data and a data object and a unique identification code corresponding to the cache object:
storing the cache objects to the cache layer and the data layer respectively;
under the condition that an object deleting instruction is received, deleting the cache object of the cache layer, and updating object metadata of the cache object;
receiving a reclaim-flush instruction initiated by a background thread of the cache layer;
in response to the reclaim-flush instruction, sending an object deletion instruction to the data layer to delete the cache object in the data layer and update the object metadata of the cache object in the data layer;
and releasing the cache space of the cache object of the cache layer and the cache space of the cache object in the data layer.
7. The data processing method according to claim 1, wherein the step of generating a cache object corresponding to the target data and a data object and a unique identification code corresponding to the cache object in response to the data write request comprises:
responding to the data writing request, and creating a cache object corresponding to the target data;
and allocating the unique identification code for the cache object by adopting an identification allocator.
8. A data processing apparatus based on multilayer cache, applied to a placement group (PG), wherein the PG interfaces with a cache layer and a data layer respectively, the apparatus comprising:
a receiving unit, configured to receive a data write request sent by a client, where the data write request at least carries: target data to be written;
a response unit, configured to generate, in response to the data write request, a cache object corresponding to the target data, and a data object and a unique identification code corresponding to the cache object, where the cache object and the data object respectively correspond to object metadata;
the first establishing unit is used for migrating the cache object into the cache layer and establishing a first mapping relation between object metadata of the cache object and the unique identification code;
and the second establishing unit is used for storing the data object into a data layer and establishing a second mapping relation between the object metadata of the data object and the unique identification code.
9. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the multi-level cache based data processing method of any one of claims 1 to 7 via execution of the executable instructions.
10. A computer-readable storage medium, comprising a stored computer program, wherein when the computer program runs, a device on which the computer-readable storage medium is located is controlled to execute the data processing method based on multi-layer cache according to any one of claims 1 to 7.
CN202210497433.7A 2022-05-09 2022-05-09 Data processing method and device based on multilayer cache and electronic equipment Pending CN114780043A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210497433.7A CN114780043A (en) 2022-05-09 2022-05-09 Data processing method and device based on multilayer cache and electronic equipment


Publications (1)

Publication Number Publication Date
CN114780043A true CN114780043A (en) 2022-07-22

Family

ID=82436792


Country Status (1)

Country Link
CN (1) CN114780043A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905114A (en) * 2023-03-09 2023-04-04 浪潮电子信息产业股份有限公司 Batch updating method and system of metadata, electronic equipment and readable storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination