CN114089923A - Dual-active storage system and data processing method thereof - Google Patents

Dual-active storage system and data processing method thereof

Info

Publication number
CN114089923A
Authority
CN
China
Prior art keywords
data
storage
logical volume
redundancy
site
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111433614.5A
Other languages
Chinese (zh)
Inventor
朱广帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN202111433614.5A priority Critical patent/CN114089923A/en
Publication of CN114089923A publication Critical patent/CN114089923A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604: Improving or facilitating administration, e.g. storage management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1458: Management of the backup or restore process
    • G06F11/1464: Management of the backup or restore process for networked environments
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062: Securing storage systems
    • G06F3/0622: Securing storage systems in relation to access
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Abstract

The application provides a dual-active storage system and a data processing method thereof. The dual-active storage system comprises a first storage site and a second storage site, the second storage site being a standby storage site of the first storage site. A first resource pool is created in the first storage site and configured with a first redundancy policy; a second resource pool is created in the second storage site and configured with a second redundancy policy. A first logical volume is provided in the first resource pool and a second logical volume in the second resource pool; the two logical volumes are configured as dual-active volumes, which record placement group (PG) logs and provide data caching. The scheme reduces the bandwidth occupation of a distributed dual-active storage system and improves system performance and reliability.

Description

Dual-active storage system and data processing method thereof
Technical Field
The present application relates to the technical field of data storage, and in particular to a dual-active storage system and a data processing method thereof.
Background
As information technology permeates hundreds of industries and people's daily lives, storage systems play an increasingly important role in the key services of every industry, and enterprises' requirements for service continuity are unprecedented. In fields such as communications, finance, healthcare, government, logistics and e-commerce in particular, a service interruption of the storage system can lose important data, severely damage enterprise credibility and cause huge economic losses. Ensuring service continuity is therefore the key to storage-system construction, and dual-active technology has emerged in precisely this context.
Dual-active technology, commonly called Active-Active, is widely accepted in the storage field. It provides users with a flexible and powerful data disaster-tolerance capability: real-time synchronous data replication, real-time monitoring of service running state, and failover between two data centers, ensuring that users can switch services across data centers and share service load online.
For example, a traditional disaster recovery deployment usually comprises a production center and a disaster recovery center; the disaster recovery center normally sits idle and is started only when a disaster paralyzes the production center. Such disaster recovery systems face the following challenges: when the production center encounters a flood, fire, man-made disaster, earthquake or the like, services must be switched to the disaster recovery center manually, and the service interruption is long, with an RTO (Recovery Time Objective) typically at the hour level, so continuous service operation cannot be guaranteed. Moreover, the disaster recovery center is idle year-round, resource utilization is low, and the total cost of ownership (TCO) rises.
Disclosure of Invention
The aim of the present application is to provide a dual-active storage system and a data processing method thereof, so as to reduce the bandwidth occupation of a distributed dual-active storage system and thereby improve system performance and reliability.
A first aspect of the present application provides a dual-active storage system based on a distributed storage scheme, comprising:
a first storage site and a second storage site, wherein the second storage site is a standby storage site of the first storage site;
a first resource pool created in the first storage site and configured with a first redundancy policy;
a second resource pool created in the second storage site and configured with a second redundancy policy;
a first logical volume provided in the first resource pool and a second logical volume provided in the second resource pool, the first logical volume and the second logical volume being configured as dual-active volumes, which record placement group (PG) logs and provide data caching.
A second aspect of the present application provides a data processing method applied to the dual-active storage system of the first aspect, the method comprising:
the first logical volume receives, from a first storage gateway, first data together with the processing result obtained by processing the first data according to the Ceph indexing rules;
the first logical volume stores the first data to the first storage site according to the processing result and records the PG log contained in the processing result, and the first storage site applies redundancy protection to the first data according to the first redundancy policy;
the first logical volume sends the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log contained therein, and the second storage site applies redundancy protection to the first data according to the second redundancy policy.
A third aspect of the present application provides a data processing method applied to the dual-active storage system of the first aspect, the method comprising:
the second logical volume receives, from a second storage gateway, second data together with the processing result obtained by processing the second data according to the Ceph indexing rules;
the second logical volume caches the second data locally and records the PG log contained in the processing result;
the second logical volume sends the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log contained therein, and the first storage site applies redundancy protection to the second data according to the first redundancy policy;
after receiving the control message sent by the first logical volume, the second logical volume stores the locally cached second data to the second storage site, and the second storage site applies redundancy protection to the second data according to the second redundancy policy.
Compared with the prior art, the dual-active storage system provided by the present application has the following beneficial effects:
1. A two-copy dual-active volume is deployed across the cluster; the volume stores no real data, records only metadata, and can additionally provide a caching function.
2. The dual-active volumes use Ceph's two-copy strong consistency to secure cross-site data flow, achieving real-time cross-site data synchronization and automatic synchronization after fault recovery; when the cluster is healthy, the data of the primary and standby sites is consistent in real time, with an RPO (Recovery Point Objective) of 0.
3. A full set of redundant copies of the data is stored within each single site, ensuring maximum data reliability.
4. Each primary and standby site in the dual-active storage system is configured with an independent data redundancy policy, so redundancy policies can be selected flexibly.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic diagram of the cluster structure of a conventional dual-active storage system;
FIG. 2 is a flow chart illustrating a data write operation of the dual-active storage system of FIG. 1;
FIG. 3 is a schematic diagram illustrating the cluster structure of the dual-active storage system provided in the present application;
FIG. 4 is a flow chart illustrating a data processing method provided herein;
FIG. 5 is a flow chart illustrating another data processing method provided herein;
FIG. 6 is a flow chart illustrating another data write operation of the dual-active storage system of FIG. 3.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which this application belongs.
In addition, the terms "first" and "second", etc. are used to distinguish different objects, rather than to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
For ease of understanding, some technical terms in the present application will be first introduced.
Ceph, a distributed storage system, is an open-source project that provides a software-defined, unified storage solution, with the advantages of large-scale scalability, high performance and no single point of failure. When an application accesses a Ceph cluster and performs a write operation, data is stored as objects in Ceph's object storage devices (OSDs). The Ceph Monitor is responsible for the health of the whole cluster; a Monitor node may be deployed alone on a physical host, or a Monitor node and a storage node may be deployed on the same physical host. In a Ceph cluster, several Monitors jointly manage, maintain and publish the cluster's state information.
In Ceph, data is stored with the object as the basic unit, and each object is 4 MB by default; several objects belong to one placement group (PG), and several PGs in turn map onto one OSD; generally, one OSD corresponds to one disk. Ceph employs a hierarchical cluster structure (Cluster Map) that the user can customize, and an OSD is simply a leaf node of this hierarchy.
Several resource pools (Pools) can be created on a Ceph cluster; each pool must specify its number of PGs at creation time, and a pool is a logical concept.
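To make the object-to-PG mapping concrete, here is a minimal Python sketch of how an object name can be hashed stably into one of a pool's PGs; it is illustrative only, not Ceph's actual CRUSH implementation, and the object name used in the example is assumed.

```python
# Minimal sketch: stable-hash an object name into one of a pool's placement
# groups (PGs). Real Ceph uses rjenkins hashing plus CRUSH; md5 stands in here.
import hashlib

def object_to_pg(pool_id: int, object_name: str, pg_num: int) -> str:
    """Return a PG identifier of the form "<pool>.<pg>" for the object."""
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    return f"{pool_id}.{h % pg_num:x}"

# Several 4 MB objects of a volume hash into the pool's PGs; in real Ceph,
# CRUSH then maps each PG onto a set of OSDs (generally one disk per OSD).
print(object_to_pg(pool_id=1, object_name="rbdA1.0000000000000004", pg_num=128))
```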
Fig. 1 is a schematic diagram of the cluster structure of an existing dual-active storage system. As shown in fig. 1, the system provides storage services across two same-city data centers in a stretched storage-service-cluster mode, deployed as follows:
1. Taking a six-node storage cluster as an example: node1, node2 and node3 are deployed at storage site A across two racks; node4, node5 and node6 are deployed at storage site B, likewise across two racks; and an arbitration server is deployed at site C.
2. The storage gateways are deployed across the sites to form a cluster, and any gateway can provide storage cluster service.
3. The storage service cluster is deployed across the sites to form a cluster. Taking a 4-copy deployment as an example, the storage fault domain is set to the rack and the number of copies is set to 4; that is, of the 4 copies, 2 run at site A and the other 2 at site B.
4. The storage control service mon nodes (Monitors) are deployed as a three-node cluster: mon1 on a server at site A, mon2 on a server at site B, and mon3 on the arbitration server at site C, preventing storage split-brain and thus failure of the data storage function.
5. When a disaster strikes site A or site B, the cross-site cluster disaster recovery of the storage system achieves service failover with RPO (Recovery Point Objective) = 0 and RTO (Recovery Time Objective) = 0, ensuring that no stored data is lost and that services continue to run without perceiving the fault.
Fig. 2 shows the data write flow of the dual-active storage system of fig. 1, which proceeds as follows:
1. The host side (client) issues the data to be written and its metadata to the storage side through the gateway, which receives and processes them.
2. The gateway computes, according to a certain indexing rule, the disks on which the data must land. As shown in fig. 2, assume these disks are node3-HDD1, node2-HDD2, node4-HDD1 and node5-HDD2, and the 4-copy relationship to be written is: node3-HDD1 is the primary copy, while node2-HDD2, node4-HDD1 and node5-HDD2 are backup copies.
3. The gateway writes the data over the network to the storage service process hosting the primary copy, node3-HDD1.
4. According to the data redundancy policy, the storage service process at node3-HDD1 simultaneously writes the data to the storage service processes at node2-HDD2, node4-HDD1 and node5-HDD2.
5. Each storage service process writes the data to disk after its own logical processing.
6. Only after all 4 copies have been written successfully does node3-HDD1 consider the write successful and return a write-success message to the host side.
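The fan-out just described can be sketched as follows. This is a simplified illustration with in-memory stand-ins for OSDs and no networking, not the actual storage service code; the node names follow the example above.

```python
# Minimal sketch of the 4-copy write flow: the primary OSD persists locally,
# replicates to every backup OSD, and only acknowledges once all copies land.
from typing import List

class Osd:
    """Toy stand-in for a storage service process on one disk."""
    def __init__(self, name: str):
        self.name = name
        self.store = {}

    def write(self, obj: str, data: bytes) -> bool:
        self.store[obj] = data
        return True

def replicated_write(primary: Osd, backups: List[Osd], obj: str, data: bytes) -> bool:
    ok = primary.write(obj, data)          # step 3: primary copy lands first
    for b in backups:
        ok = b.write(obj, data) and ok     # step 4: fan out to the 3 backups
    return ok                              # step 6: ack only when all 4 succeed

primary = Osd("node3-HDD1")
# Note: node4-HDD1 and node5-HDD2 sit at site B, so two of these writes
# cross the inter-site link, which is exactly the bandwidth cost criticized below.
backups = [Osd("node2-HDD2"), Osd("node4-HDD1"), Osd("node5-HDD2")]
assert replicated_write(primary, backups, "obj.0001", b"payload")
```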
From the above write flow, the existing dual-active storage system has the following disadvantages:
Disadvantage 1: host performance degradation. Because the number of copies is 4, two identical pieces of data must be transmitted to site B and written to different disks, which degrades host performance.
Disadvantage 2: increased link construction cost. Transmitting the redundant data doubles the required link bandwidth, wasting link resources and raising construction cost.
Disadvantage 3: reduced reliability. A site failure makes two of the copies unavailable, increasing the risk that continuously running services go down.
Disadvantage 4: only the cross-cluster copy scheme is supported; erasure code (EC) redundancy is not, which increases space consumption.
Of course, the above scheme can be simplified to 2 copies, but data reliability drops. For example, with the number of copies set to 2, sites A and B each hold one copy. If site A fails, site B takes over the services with only a single copy providing storage, leaving a single point of failure; if the hard disk or server holding that copy then fails, the services have no data storage capability left and go down.
In view of the above, embodiments of the present application provide a dual-active storage system and a data processing method thereof, described below with reference to the accompanying drawings.
Please refer to fig. 3, which illustrates the cluster structure of a dual-active storage system according to some embodiments of the present application. The dual-active storage system is based on a distributed storage scheme; for example, it may be Ceph, or another distributed storage scheme, which the present application does not limit.
As shown in fig. 3, the dual-active storage system includes a first storage site 100 and a second storage site 200, the second storage site 200 being a standby storage site of the first storage site 100.
A first resource pool 110 is created in the first storage site 100 and configured with a first redundancy policy; a second resource pool 210 is created in the second storage site 200 and configured with a second redundancy policy.
For example, as shown in fig. 3, the first resource pool 110 comprises storage nodes node1, node2 and node3, and the second resource pool 210 comprises storage nodes node4, node5 and node6; each storage node contains two disks, and each disk corresponds to one object storage device (OSD).
Specifically, the first redundancy policy may be copy redundancy or erasure code redundancy, and likewise the second redundancy policy. For example, three combinations are possible: site 100 configures copy redundancy and site 200 copy redundancy; site 100 configures copy redundancy and site 200 erasure code redundancy; site 100 configures erasure code redundancy and site 200 erasure code redundancy.
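As an illustration, the per-site policies might be modeled as below. The RedundancyPolicy type and the 4+2 erasure-coding parameters are assumptions made for the example, not values fixed by the embodiment (which configures 3-copy and 2-copy pools later in the text).

```python
# Hedged sketch: each site's pool carries its own, independently chosen
# redundancy policy. This is a toy data model, not a Ceph API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RedundancyPolicy:
    kind: str                     # "copy" or "erasure"
    copies: Optional[int] = None  # number of copies, for copy redundancy
    k: Optional[int] = None       # data fragments, for erasure coding
    m: Optional[int] = None       # parity fragments, for erasure coding

pool_a1 = RedundancyPolicy(kind="copy", copies=3)     # site 100: 3-copy
pool_b1 = RedundancyPolicy(kind="erasure", k=4, m=2)  # site 200: EC 4+2 (assumed)
```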
A first logical volume is created in the first resource pool 110 and a second logical volume in the second resource pool 210; the first and second logical volumes are configured as a dual-active volume pair, which records PG logs (PGlog) and provides data caching.
Specifically, the back-end storage of the dual-active volumes may be a distributed cache, such as a cache tier, or a distributed database, such as MongoDB.
For example, a logical dual-active pool PoolAB1 is created with its redundancy policy configured as 2-copy, so that the existing CRUSH algorithm realizes the 2-copy mechanism and the strong data consistency mechanism. The CRUSH rule is set so that, per the plan of the data-sending end, the primary copy is local. The CRUSH algorithm is the tool used to compute which OSDs an object is distributed onto.
Specifically, a local resource pool PoolA1 is created at site 100 with its redundancy policy configured as copy redundancy and the number of copies set to 3, and a local resource pool PoolB1 is created at site 200 with its redundancy policy configured as copy redundancy and the number of copies set to 2.
A volume rbdA1 is created in 100/PoolA1 and a volume rbdB1 in 200/PoolB1. The dual-active relationship is created by selecting 100/PoolA1/rbdA1 and 200/PoolB1/rbdB1 in the dual-active pool PoolAB1; at this point the dual-active relationship object 100/PoolA1/rbdA1-200/PoolB1/rbdB1 is generated in a PG object of the dual-active pool, and its key value is used to record the PGlog of subsequent write operations.
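A minimal sketch of that relationship record follows. The Python classes are illustrative stand-ins for the PG object and its key-value PGlog, not the patent's actual data structures; the key format mirrors the text above.

```python
# Toy model of the dual-active relationship object held in PoolAB1's PG,
# keyed by the pair of member volumes and carrying the write log (PGlog).
from dataclasses import dataclass, field
from typing import List

@dataclass
class PgLogEntry:
    volume: str
    lba: int      # logical block address of the write
    length: int
    seq: int      # write sequence number

@dataclass
class DualActiveRelation:
    primary: str  # e.g. "100/PoolA1/rbdA1"
    standby: str  # e.g. "200/PoolB1/rbdB1"
    pglog: List[PgLogEntry] = field(default_factory=list)

    @property
    def key(self) -> str:
        """The relationship object key recorded in the dual-active pool."""
        return f"{self.primary}-{self.standby}"

rel = DualActiveRelation("100/PoolA1/rbdA1", "200/PoolB1/rbdB1")
print(rel.key)  # 100/PoolA1/rbdA1-200/PoolB1/rbdB1
```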
For write operations, the logical dual-active pool is specifically responsible for providing a PG mechanism for the dual-active relationship, which implements the copy redundancy mechanism and guarantees strong data consistency, but does not directly store data; the data to be written is routed into the designated resource space through the index relationship. Three types of data need to be stored:
First, writes to the local site's resource pool: this is an independent storage resource pool whose redundancy policy and redundancy level can be configured freely. A write operation starts from an object: the object name is computed from the volume name, logical block address (LBA) and length, the object is routed to a primary OSD by the CRUSH algorithm, and the PG containing the object then applies redundancy protection and writes it to disk. Objects provide a uniform interface for client read and write requests.
Second, writes to the peer site's resource pool: the PG in the dual-active pool is responsible for forwarding the write operation to the target site for processing, and the target site's PG indexes the data into the local resource (200/PoolB1/rbdB1 or 100/PoolA1/rbdA1) for writing. The write again starts from an object: the object name is computed from the volume name, LBA and length, routed to a primary OSD by the CRUSH algorithm, and redundancy-protected by the object's PG before landing on disk. A more efficient distributed cache technology can be used for the back-end device to improve IO performance.
Third, the dual-active pool's PGlog: it records the fact that data has changed, the main recorded keys being the volume name, LBA, length and write sequence number. Because the dual-active pool is a logical pool that stores no real data, the PGlog can be written wherever required, such as a distributed database, a pool reused from the dual-active member volume's local resources, another pool, a separately created replica pool, a cache, or a distributed cache.
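The object-name indexing shared by the first two write types can be sketched as follows. The `volume.object_number` naming scheme and the 4 MB object size are assumptions for illustration; real RBD uses its own image-prefix format.

```python
# Hedged sketch: derive the object names touched by a (volume, LBA, length)
# write, splitting it at 4 MB object boundaries as described in the text.
OBJECT_SIZE = 4 * 1024 * 1024  # Ceph's default 4 MB object size

def lba_to_objects(volume: str, lba: int, length: int):
    """Yield (object_name, offset_in_object, chunk_length) for one write."""
    end = lba + length
    while lba < end:
        obj_no = lba // OBJECT_SIZE
        off = lba % OBJECT_SIZE
        chunk = min(OBJECT_SIZE - off, end - lba)
        yield (f"{volume}.{obj_no:016x}", off, chunk)
        lba += chunk

# A 1 KB write starting 512 bytes before an object boundary straddles
# two objects; each named object is then routed to its PG/OSD by CRUSH.
for item in lba_to_objects("rbdA1", lba=OBJECT_SIZE - 512, length=1024):
    print(item)
```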
In the dual-active storage system provided by the present application, when a site fails, data is written to the single surviving site and the dual-active pool records the PGlog. Once both sites have recovered, data recovery is driven by the PGlog: data is read from the end holding the PGlog and synchronized to the site at the other end.
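A hedged sketch of that recovery replay is given below. The reader and writer callbacks and the Entry record are assumed stand-ins for the site-access paths; only the replay ordering by sequence number reflects the text.

```python
# Minimal sketch: after the failed site returns, replay every PGlog entry
# newer than that site's last synchronized sequence number.
from collections import namedtuple

Entry = namedtuple("Entry", "volume lba length seq")

def recover(pglog, read_from_survivor, write_to_recovered, last_synced_seq):
    """Replay logged writes, oldest first, skipping already-synced entries."""
    for e in sorted(pglog, key=lambda e: e.seq):
        if e.seq <= last_synced_seq:
            continue  # already present on the recovered site
        data = read_from_survivor(e.volume, e.lba, e.length)
        write_to_recovered(e.volume, e.lba, data)

# Demo: the recovered site missed every write with seq > 7.
log = [Entry("rbdA1", 0, 4096, 7), Entry("rbdA1", 8192, 4096, 8)]
recover(log,
        read_from_survivor=lambda v, lba, n: b"\0" * n,            # stub reader
        write_to_recovered=lambda v, lba, d: print(v, lba, len(d)),
        last_synced_seq=7)
```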
In one possible implementation, the dual-active storage system provided by the present application further includes an arbitration site 300. Specifically, a first monitor mon1 is deployed at the first storage site 100 and a second monitor mon2 at the second storage site; an arbitration monitor mon3 is deployed at the arbitration site to provide arbitration services for the first and second monitors and to prevent split-brain.
As shown in fig. 3, the first monitor mon1 is deployed on node2, the second monitor mon2 on node5, and the arbitration monitor mon3 on the arbitration server.
In one possible implementation, the dual-active storage system provided by the present application further includes a first storage gateway 120, configured to receive first data sent by a client, process it according to the Ceph indexing rules to obtain a processing result, and send the processing result and the first data to the first logical volume; the processing result includes the PGlog and the target object storage device (OSD).
In one possible implementation, the dual-active storage system provided by the present application further includes a second storage gateway 220, configured to receive second data sent by a client, process it according to the Ceph indexing rules to obtain a processing result, and send the processing result and the second data to the second logical volume.
In one possible implementation of the dual-active storage system provided by the present application, the first storage site 100 is one independent protection domain and the second storage site 200 is another independent protection domain.
In the present application the Ceph cluster is thus divided into two protection domains: site 100 is an independent protection domain and site 200 is an independent protection domain. This division prevents a site's local data redundancy from being distributed across sites. A protection domain is a logical concept introduced to improve cluster reliability: one piece of data (including its copies or fragments) exists only within one protection domain, and heartbeat detection is likewise performed within the domain.
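The placement constraint can be sketched as follows. The domain map and the selection rule are simplified assumptions; real Ceph would express this as a CRUSH rule rather than a Python function.

```python
# Hedged sketch of the protection-domain rule: every copy (or EC fragment)
# of a datum is placed only on OSDs inside one domain, so a site's local
# redundancy never spans sites.
def place_copies(osds_by_domain: dict, domain: str, copies: int):
    """Choose `copies` OSDs strictly within the given protection domain."""
    candidates = osds_by_domain[domain]
    if len(candidates) < copies:
        raise ValueError("not enough OSDs in protection domain")
    return candidates[:copies]  # real systems use CRUSH-weighted selection

osds = {"site100": ["node1-HDD1", "node2-HDD1", "node3-HDD1"],
        "site200": ["node4-HDD1", "node5-HDD1", "node6-HDD1"]}
print(place_copies(osds, "site100", copies=3))  # never mixes sites
```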
The foregoing embodiments provide a dual-active storage system. Correspondingly, the present application further provides two data processing methods based on it: one for a client write operation issued to the first storage gateway, the other for a client write operation issued to the second storage gateway.
Specifically, after a client write operation is issued to the first storage gateway, the data processing method, as shown in fig. 4, includes the following steps:
S101, the first logical volume receives, from the first storage gateway, first data together with the processing result obtained by processing the first data according to the Ceph indexing rules;
S102, the first logical volume stores the first data to the first storage site according to the processing result and records the PG log contained in the processing result, and the first storage site applies redundancy protection to the first data according to the first redundancy policy;
S103, the first logical volume sends the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log contained therein, and the second storage site applies redundancy protection to the first data according to the second redundancy policy.
As shown in fig. 3, the data processing flow is as follows:
1. The service client issues a write operation to the first storage gateway, which processes the first data to obtain a processing result.
2. The first storage gateway sends the first data and its processing result to the primary replica (the first logical volume).
3. The primary replica caches the first data locally and records the PGlog according to the PG mechanism; the first storage site writes the data and applies redundancy protection.
4. The primary replica sends the first data and its processing result to the secondary replica (the second logical volume) for processing.
5. The secondary replica caches the first data locally and records the PGlog according to the PG mechanism, realizing the dual-active data management logic.
6. The second storage site writes the data and applies redundancy protection; the primary replica receives the write-completion message and returns it to the client.
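An end-to-end sketch of this primary-site path follows. The Site and DualActiveVolume classes are toy stand-ins with no networking and no real redundancy engine, used only to show the ordering of caching, PGlog recording and cross-site forwarding.

```python
# Minimal sketch of the primary-site write path (steps 1-6 above).
class Site:
    """Stand-in for a storage site; storing here represents the site's
    redundancy-protected write (copy or EC, per its own policy)."""
    def __init__(self, name):
        self.name, self.objects = name, {}
    def store_with_redundancy(self, obj, data):
        self.objects[obj] = data

class DualActiveVolume:
    def __init__(self, site, peer=None):
        self.site, self.peer = site, peer
        self.cache, self.pglog = {}, []   # the volume caches, stores no real data

    def write(self, obj, data, seq, forward=True):
        self.cache[obj] = data                       # cache locally
        self.pglog.append((obj, seq))                # record the PGlog
        self.site.store_with_redundancy(obj, data)   # local site redundancy
        if forward and self.peer:
            self.peer.write(obj, data, seq, forward=False)  # sync to standby
        return "write complete"

site_a, site_b = Site("100"), Site("200")
rbd_a1, rbd_b1 = DualActiveVolume(site_a), DualActiveVolume(site_b)
rbd_a1.peer, rbd_b1.peer = rbd_b1, rbd_a1
print(rbd_a1.write("rbdA1.0000", b"first data", seq=1))
```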
Specifically, after a client write operation is issued to the second storage gateway, the data processing method, as shown in fig. 5, includes the following steps:
S201, the second logical volume receives, from the second storage gateway, second data together with the processing result obtained by processing the second data according to the Ceph indexing rules;
S202, the second logical volume caches the second data locally and records the PG log contained in the processing result;
S203, the second logical volume sends the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log contained therein, and the first storage site applies redundancy protection to the second data according to the first redundancy policy;
S204, after receiving the control message sent by the first logical volume, the second logical volume stores the locally cached second data to the second storage site, and the second storage site applies redundancy protection to the second data according to the second redundancy policy.
As shown in fig. 6, the data processing flow is as follows:
1. The service client issues a write operation to the second storage gateway, which processes the second data to obtain a processing result.
2. The second storage gateway sends the second data and its processing result to the secondary replica.
3. The secondary replica caches the second data locally and records the PGlog according to the PG mechanism.
4. The secondary replica sends the second data and its processing result to the primary replica for processing.
5. The primary replica caches the second data locally and records the PGlog according to the PG mechanism, realizing the dual-active data management logic.
6. The first storage site writes the data and applies redundancy protection, and a control message is sent to the secondary replica to trigger its data write.
7. The second storage site writes the data and applies redundancy protection; the primary replica receives the all-written message, and the secondary replica receives the write-completion message and returns write completion to the client.
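The standby-site path differs from the primary-site path in that the local flush waits for the primary's control message, so both sites commit in a fixed order. The sketch below, again with toy classes and assumed names, shows that ordering.

```python
# Minimal sketch of the standby-site write path (steps 1-7 above).
class Site:
    def __init__(self, name):
        self.name, self.objects = name, {}
    def store_with_redundancy(self, obj, data):
        self.objects[obj] = data  # stands in for the site's redundancy write

class PrimaryVolume:
    def __init__(self, site):
        self.site, self.pglog = site, []
    def write(self, obj, data, seq):
        self.pglog.append((obj, seq))               # record the PGlog
        self.site.store_with_redundancy(obj, data)  # site A commits first

class SecondaryVolume:
    def __init__(self, site, primary):
        self.site, self.primary = site, primary
        self.cache, self.pglog = {}, []

    def write(self, obj, data, seq):
        self.cache[obj] = data              # S202: cache locally
        self.pglog.append((obj, seq))       #       and record the PGlog
        self.primary.write(obj, data, seq)  # S203: forward to the primary
        self.on_control_message(obj)        # S204: primary's go-ahead arrives
        return "write complete"

    def on_control_message(self, obj):
        # Flush the cached data into the standby site's own redundancy pool.
        self.site.store_with_redundancy(obj, self.cache.pop(obj))

site_a, site_b = Site("100"), Site("200")
rbd_b1 = SecondaryVolume(site_b, PrimaryVolume(site_a))
print(rbd_b1.write("rbdB1.0000", b"second data", seq=1))
```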
The data processing methods of the dual-active storage system above achieve cross-site virtual-machine copy protection together with single-site independent data redundancy protection, that is, a cross-site dual-active technique under distributed storage. Based on Ceph's existing PG mechanism, the cross-site dual-active function is realized through a dual-active-layer PG and a data-storage-layer PG. Per the dual-active requirement, the dual-active pool's CRUSH rule places the primary copy at the expected site, which improves data read and write performance. Data is copied across the sites only once, and the dual-active metadata is stored extremely compactly, occupying no additional network resources or storage space. The replica protection mechanism that originally operated between OSDs is abstracted up to the client layer, retaining the PGlog mechanism without storing real data, which realizes a cross-site dual-write strong-consistency mechanism. As a modification of the PG mechanism, when the primary and secondary replicas synchronize data they may, according to the data identification, either synchronize the real data or send only a disk-write control message, achieving efficient strong consistency between cross-site replicas.
Finally, it should be noted that: the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present disclosure and shall be covered by the claims and the specification.

Claims (10)

1. A dual-active storage system based on a distributed storage scheme, characterized by comprising: a first storage site and a second storage site, wherein the second storage site is a standby storage site of the first storage site;
a first resource pool created in the first storage site and configured with a first redundancy policy;
a second resource pool created in the second storage site and configured with a second redundancy policy;
a first logical volume provided in the first resource pool and a second logical volume provided in the second resource pool, the first logical volume and the second logical volume being configured as dual-active volumes, which record placement group (PG) logs and provide data caching.
2. The dual-active storage system of claim 1, wherein the back-end storage of the dual-active volumes is a distributed cache or a distributed database.
3. The dual-active storage system of claim 1, wherein the first redundancy policy is copy redundancy or erasure code redundancy.
4. The dual-active storage system of claim 1, wherein the second redundancy policy is copy redundancy or erasure code redundancy.
5. The dual-active storage system of claim 1, further comprising: an arbitration site;
wherein a first monitor is deployed at the first storage site and a second monitor at the second storage site, and an arbitration monitor is deployed at the arbitration site to provide arbitration services to the first monitor and the second monitor.
6. The dual-active storage system of claim 1, further comprising:
a first storage gateway, configured to receive first data sent by a client, process the first data according to the Ceph indexing rules to obtain a processing result, and send the processing result and the first data to the first logical volume, the processing result comprising a PG log and the target object storage device (OSD).
7. The dual-active storage system of claim 1, further comprising:
a second storage gateway, configured to receive second data sent by a client, process the second data according to the Ceph indexing rules to obtain a processing result, and send the processing result and the second data to the second logical volume.
8. The dual-active storage system of claim 1, wherein the first storage site is one independent protection domain and the second storage site is another independent protection domain.
9. A data processing method applied to the dual-active storage system of any one of claims 1 to 8, the method comprising:
the first logical volume receiving, from a first storage gateway, first data together with the processing result obtained by processing the first data according to the Ceph indexing rules;
the first logical volume storing the first data to the first storage site according to the processing result and recording the PG log contained in the processing result, the first storage site applying redundancy protection to the first data according to the first redundancy policy; and
the first logical volume sending the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log contained therein, the second storage site applying redundancy protection to the first data according to the second redundancy policy.
10. A data processing method applied to the dual-active storage system of any one of claims 1 to 8, the method comprising:
the second logical volume receiving, from a second storage gateway, second data together with the processing result obtained by processing the second data according to the Ceph indexing rules;
the second logical volume caching the second data locally and recording the PG log contained in the processing result;
the second logical volume sending the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log contained therein, the first storage site applying redundancy protection to the second data according to the first redundancy policy; and
after receiving the control message sent by the first logical volume, the second logical volume storing the locally cached second data to the second storage site, the second storage site applying redundancy protection to the second data according to the second redundancy policy.
CN202111433614.5A 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof Pending CN114089923A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111433614.5A CN114089923A (en) 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111433614.5A CN114089923A (en) 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof

Publications (1)

Publication Number Publication Date
CN114089923A (en) 2022-02-25

Family

ID=80305576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111433614.5A Pending CN114089923A (en) 2021-11-29 2021-11-29 Double-live storage system and data processing method thereof

Country Status (1)

Country Link
CN (1) CN114089923A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453146A (en) * 2023-12-22 2024-01-26 芯能量集成电路(上海)有限公司 Data reading method, system, eFlash controller and storage medium
CN117453146B (en) * 2023-12-22 2024-04-05 芯能量集成电路(上海)有限公司 Data reading method, system, eFlash controller and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination