CN105577776A - Distributed storage system and method based on data arbiter copy - Google Patents


Info

Publication number
CN105577776A
Authority
CN
China
Prior art keywords
data
module
memory module
copy
arbitrator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510955956.1A
Other languages
Chinese (zh)
Inventor
雍帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eisoo Information Technology Co Ltd
Original Assignee
Shanghai Eisoo Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eisoo Information Technology Co Ltd filed Critical Shanghai Eisoo Information Technology Co Ltd
Priority to CN201510955956.1A priority Critical patent/CN105577776A/en
Publication of CN105577776A publication Critical patent/CN105577776A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The invention provides a distributed storage system and method based on a data arbiter copy, used for online distributed data storage. The method comprises the following steps: S1, the distributed storage system receives a data write request; S2, after receiving the data write request, the distributed storage system starts the data write operation and synchronously writes the data into a data copy storage module; S3, the data write operation of the data copy storage module is completed, and the data copy storage module directly returns write confirmation information; S4, a data arbiter copy module continues to asynchronously write data from the data copy storage module, backing up the complete data arbitration information used for data migration and repair; and S5, the write operation of the data arbiter copy module is completed, and asynchronous write confirmation information is returned. The distributed storage system and method solve the problems in the prior art that existing distributed storage systems have relatively small effective storage space, low read/write efficiency and difficult data repair, and that the conventional two-copy storage method leads to split-brain.

Description

Distributed storage system and method based on a data arbiter copy
Technical field
The present invention relates to distributed storage systems, and in particular to a distributed storage system and method based on a data arbiter copy.
Background
With the development of information technology, the global volume of data is growing explosively. Centralized storage servers rely heavily on traditional magnetic storage devices, are inconvenient to operate, and are difficult to scale in performance. The online centralized storage servers that have appeared recently are still technically immature: they cannot meet requirements such as high reliability and scalability, they tend to become performance hotspots, and severe failures occur from time to time. Online distributed storage addresses these problems well.
Ceph is an open-source, general-purpose distributed storage system. It offers excellent data access performance, markedly improved system reliability, and scalable storage performance. It is widely used in e-commerce, online core business systems, and mass data storage and analysis, is supported by many open-source organizations, and has accelerated the adoption of new distributed storage systems. Although Ceph is a successful open-source project used in many fields, it still has problems that need improvement. Ceph stores multiple copies of each data storage object to ensure high data reliability, so that stored information is not easily lost when an anomaly such as a node going offline occurs. To avoid the distributed-storage split-brain problem that a two-copy scheme causes in real production environments, three copies are generally used, which gives higher storage reliability. However, with high-performance distributed block storage devices, the three-copy policy sharply reduces the effective storage capacity, and the high hardware cost is an obvious disadvantage that hinders adoption in production. In addition, with three copies the stored object, the operation history, and the read/write information must each be written three times, so write efficiency is also low.
In its data read/write flow, Ceph adopts the basic ROWA (Read-One-Write-All) policy: a write operation must write the information contained in the stored data object to all copies, and all read operations read the stored data object and list information from the primary copy. The client performing the storage operation communicates directly with data copy storage module OSD1 (OSD: Object-based Storage Device): the client sends a write request to the distributed storage server and initiates the data object write (Write1). After receiving the write request, OSD1 initiates data object writes to data copy storage modules OSD2 and OSD3 (Write2, Write3). After OSD2 and OSD3 each complete the data object write, each sends a write-completion acknowledgment to OSD1 (Ack4, Ack5). Once OSD1 has confirmed that the data object writes on the other two data copy storage modules are complete, OSD1 completes its own data write and confirms to the client that the data object write is complete (Ack6). This implementation ensures reliability during the data object write, avoids as far as possible the data loss that could occur when the distributed storage system encounters an anomaly, reduces the complexity and difficulty of implementing the system, and improves read performance to a limited extent. However, because all copies must be written, the write acknowledgment mechanism is lengthy, data object write latency is high, and the system's average IOPS (input/output operations per second) is low.
In summary, in its data storage, input/output control, and read/write operations, the prior-art Ceph distributed storage system adopts a three-copy storage policy to prevent the split-brain problem of two-copy schemes. This leads to low write efficiency and low storage device utilization, raises the cost of redundant capacity, and yields relatively low write performance. The three-copy write method also causes high write latency and low IOPS.
Summary of the invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a distributed storage system and method based on a data arbiter copy, in order to solve problems such as the relatively small effective storage space and the split-brain that exists in the prior-art two-copy storage method.
To achieve the above object and other related objects, the invention provides a distributed storage system and method based on a data arbiter copy, used for online distributed data storage. The distributed storage method based on a data arbiter copy comprises the following steps:
S1, the distributed storage system receives a data write request;
S2, after receiving the write request, the distributed storage system starts the data write operation and synchronously writes the data into the data copy storage module;
S3, the data write operation of the data copy storage module is completed, and the data copy storage module directly returns write confirmation information;
S4, the data arbiter copy module continues to asynchronously write data from the data copy storage module, backing up the complete data arbitration information to support data migration and repair;
S5, the write operation of the data arbiter copy module is completed, and asynchronous write confirmation information is returned.
In one embodiment of the present invention, the data write flow adopts a Write-Quorum flow: if the total number of data copy storage modules is n, then once the number of data copy storage modules that have completed the data write exceeds n/2, the write confirmation information is returned directly, and the data for the remaining copy modules is written asynchronously.
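The Write-Quorum rule above can be sketched as a small check. This is a minimal illustration, not the patented implementation; the function names `quorum_reached` and `write_quorum` are hypothetical.

```python
def quorum_reached(acks: int, n: int) -> bool:
    """Return True once strictly more than n/2 copy modules have acknowledged."""
    if n <= 0 or not 0 <= acks <= n:
        raise ValueError("acks must lie between 0 and n, and n must be positive")
    # Integer form of acks > n / 2, which avoids floating-point comparison.
    return 2 * acks > n


def write_quorum(n: int) -> int:
    """Smallest number of acknowledgments that exceeds n/2."""
    if n <= 0:
        raise ValueError("n must be positive")
    return n // 2 + 1
```

With n = 3, two acknowledgments are enough for the write confirmation to be returned; the third copy can then be written asynchronously.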
In one embodiment of the present invention, the data arbitration information is operation history information, used to repair data that a data copy storage module failed to write because of an anomaly; the metadata, data body, and operation history information of a data object are all written to the data copy storage modules; the data body is not written to the data arbiter copy module.
In one embodiment of the present invention, the distributed storage method based on a data arbiter copy further comprises the following steps:
S1', determine whether any data copy storage module has gone offline;
S2', if so, the distributed storage system excludes the offline data copy storage module and adjusts the number of data copy storage modules targeted by write operations;
S3', determine whether the offline data copy storage module has come back online;
S4', if so, that data copy storage module obtains the complete operation history information and the data body from the other data copy storage modules or from the data arbiter copy module, and performs data repair;
S5', if not, the data copy storage modules that have not gone offline continue the data write operation.
In one embodiment of the present invention, step S4' further comprises:
S41', the data copy storage module that went offline traverses its own operation history information;
S42', the operation history information in the offline data copy storage module is merged with the operation history information in the data arbiter copy module to obtain the complete operation history information;
S43', according to the merged operation history information, the data copy storage module that has come back online supplements the missing metadata and data body from the other data copy storage modules that did not go offline.
In one embodiment of the present invention, the distributed storage system based on a data arbiter copy is used for online distributed data storage and is installed on a server platform. It comprises a first data copy storage module, a second data copy storage module and a data arbiter copy module. The first data copy storage module stores the metadata to be read, the data body and the operation history information. The second data copy storage module has the same specification as the first data copy storage module and backs up the metadata, the data body and the operation history information. The data arbiter copy module backs up the metadata and the complete operation history information.
In one embodiment of the present invention, the first data copy storage module comprises a metadata storage module, a history information module and a data body storage module; the second data copy storage module comprises a metadata backup module, a history information backup module and a data body backup module.
In one embodiment of the present invention, the data arbiter copy module comprises a metadata backup module and a history information backup module.
In one embodiment of the present invention, the distributed storage system based on a data arbiter copy can be extended to a distributed-storage multi-active data center.
In one embodiment of the present invention, the distributed-storage multi-active data center comprises data center modules and a data arbiter center module.
As described above, the distributed storage system and method based on a data arbiter copy provided by the invention have the following beneficial effects:
In the data read/write flow, the Ceph distributed storage system and method based on a data arbiter copy provided by the invention write part of the information contained in the stored data object asynchronously. The information contained in the data storage object is first written synchronously to the first data copy storage module and the second data copy storage module, and the complete operation history is then written asynchronously to the data arbiter copy module for backup. All read operations on object data read the stored data object information and list information from the first data copy storage module. The client performing the storage operation first communicates directly with the first data copy storage module: the client sends a write request to the server side of the distributed storage system and initiates the data object write. After receiving the write request, the first data copy storage module first synchronously sends the write request for the backup data to the second data copy storage module, and then asynchronously initiates a data object write to the data arbiter copy module. After the first data copy storage module, the second data copy storage module and the data arbiter copy module each complete the data object write, the second data copy storage module sends the corresponding write-completion acknowledgment to the first data copy storage module. Once the operation histories of the first and second data copy storage modules are complete and all data object writes have finished, the server side sends an acknowledgment to the client confirming that the data object write is complete. When an anomaly such as going offline occurs in one data copy storage module, the unaffected data copy storage modules continue to write object data, and the failed module is repaired after it returns to normal operation. This implementation ensures reliability during the data object write, avoids as far as possible the data loss that could occur when the distributed storage system encounters an anomaly, reduces the complexity and difficulty of implementing the system, improves read performance, and solves the problems of the prior-art three-copy policy: a lengthy acknowledgment mechanism, high data object write latency, and low average system IOPS (input/output operations per second).
Brief description of the drawings
Fig. 1 is a schematic diagram of data writing in the distributed storage method based on a data arbiter copy of the present invention.
Fig. 2 is a flow chart of data writing in the distributed storage method based on a data arbiter copy of the present invention.
Fig. 3 is a schematic diagram of the information stored in the data copies of the present invention.
Fig. 4 is a flow chart of the basic steps of data migration and repair of the present invention.
Fig. 5 is a schematic diagram of the detailed steps by which an offline copy repairs its data in the present invention.
Fig. 6 is a sequence diagram of an offline copy repairing its data in the present invention.
Fig. 7 is a module diagram of the present invention.
Fig. 8 is a communication diagram of the data center extension embodiment of the present invention.
Reference numerals
1 distributed storage system
11 first data copy storage module
12 second data copy storage module
13 data arbiter copy module
111 metadata storage module
112 history information module
113 data body storage module
121 metadata backup module
122 history information backup module
123 data body backup module
131 metadata backup module
132 history information backup module
101 data center module
102 data center module
103 data arbiter center module
Detailed description
Embodiments of the present invention are described below by way of specific examples; those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification.
Refer to Fig. 1 to Fig. 8. Note that the structures and content shown in the drawings of this specification are only intended to accompany the disclosure of the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the invention can be implemented; they therefore have no substantive technical significance. Any structural modification, change of proportion, or adjustment of size that does not affect the effects the invention can produce or the objects it can achieve shall still fall within the scope covered by the disclosed technical content. Likewise, terms such as "upper", "lower", "left", "right", "middle" and "a" used in this specification are only for clarity of description and are not intended to limit the implementable scope of the invention; changes or adjustments of their relative relationships, without substantive change to the technical content, shall also be regarded as within the implementable scope of the invention.
To achieve the above object and other related objects, the invention provides a distributed storage system and method based on a data arbiter copy, used for online distributed data storage.
Refer to Fig. 2, which shows the data write flow of the distributed storage method based on a data arbiter copy. As shown in Fig. 2, the basic steps of the method are:
S1, the distributed storage system receives a data write request;
S2, after receiving the write request, the distributed storage system starts the data write operation and synchronously writes the data into the data copy storage module;
S3, the data write operation of the data copy storage module is completed, and the data copy storage module directly returns write confirmation information;
S4, the data arbiter copy module asynchronously writes data from the data copy storage module, backing up the complete data arbitration information to support data migration and repair;
S5, the write operation of the data arbiter copy module is completed, and asynchronous write confirmation information is returned.
Data arbiter: a data storage object that replaces one data copy in a distributed system. It stores only the metadata of the data and the incremental operation log of the data. Its main role, when the data system repairs data after an anomaly, is to act as a participant in the repair and decide the correctness of data updates. The data arbiter module stores the data arbiter and supports data migration and repair.
Refer to Fig. 1, which shows data writing in the distributed storage method based on a data arbiter copy. As shown in Fig. 1, the data write flow adopts a Write-Quorum flow: with a total of 3 data copy storage modules, once 2 of them (more than half of 3) have completed the data write, the write confirmation information is returned directly, and the data for the remaining copy module is written asynchronously. The client communicates directly with data copy storage module 11 and initiates the write operation (Write1). After receiving the request, data copy storage module 11 initiates write operations to data copy storage module 12 and data arbiter copy module 13 (Write2, AsyncWrite3), where Write2 is a synchronous write and AsyncWrite3 is an asynchronous write. After data copy storage module 12 completes the write, it sends an acknowledgment (Ack4) back to data copy storage module 11. After receiving the acknowledgment from data copy storage module 12, data copy storage module 11 completes its own data write and confirms to the client that the data object write is complete (Ack6). After data arbiter copy module 13 completes the write, it sends an asynchronous acknowledgment (AsyncAck5) back to data copy storage module 11; data copy storage module 11 does not need to wait for data arbiter copy module 13 to finish and can return directly. In the above flow, under normal circumstances data copy storage module 11 and data copy storage module 12 hold complete data copies, and the data in data arbiter copy module 13 is the data arbiter. When data is read, it is preferentially read from data copy storage module 11. In Fig. 1, OSD1 (OSD: Object-based Storage Device) corresponds to data copy storage module 11, OSD2 corresponds to data copy storage module 12, and OSD3 corresponds to data arbiter copy module 13. The above is the ROWQ (Read-One-Write-Quorum) data read/write flow as improved by the present invention.
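The message order of this write flow (Write1, Write2, AsyncWrite3, Ack4, AsyncAck5, Ack6) can be illustrated with a small asyncio simulation. It is only a sketch under stated assumptions: `CopyModule`, `rowq_write` and `demo` are hypothetical names, network transfers are replaced by `asyncio.sleep`, and each module records only a (seqno, object_id) log entry.

```python
import asyncio


class CopyModule:
    """Hypothetical stand-in for a data copy storage module or the arbiter copy module."""

    def __init__(self, name, delay=0.0):
        self.name = name
        self.delay = delay  # simulated write latency in seconds
        self.log = []       # (seqno, object_id) entries

    async def write(self, seqno, object_id):
        await asyncio.sleep(self.delay)
        self.log.append((seqno, object_id))
        return f"ack:{self.name}"


async def rowq_write(primary, replica, arbiter, seqno, object_id):
    # AsyncWrite3: start the arbiter write but do not wait for it.
    arbiter_task = asyncio.create_task(arbiter.write(seqno, object_id))
    # Write2 / Ack4: the write to the second full copy is awaited synchronously.
    await replica.write(seqno, object_id)
    # The primary then completes its own write.
    await primary.write(seqno, object_id)
    # Ack6: the client is answered without waiting for AsyncAck5.
    return "ack:client", arbiter_task


async def demo():
    osd1 = CopyModule("osd1")
    osd2 = CopyModule("osd2")
    osd3 = CopyModule("osd3", delay=0.05)  # arbiter, deliberately slower
    ack, arbiter_task = await rowq_write(osd1, osd2, osd3, 1, "obj-a")
    arbiter_pending = not arbiter_task.done()
    await arbiter_task  # AsyncAck5 arrives later
    return ack, arbiter_pending, osd3.log
```

Running `asyncio.run(demo())` shows that the client acknowledgment is produced while the arbiter write is still in flight, which is the point of the asynchronous third write.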
Refer to Fig. 3, which shows the information stored in the data copies. As shown in Fig. 3, the data arbitration information is operation history information, used to repair data that a data copy storage module failed to write because of an anomaly; the metadata, data body, and operation history information of a data object are all written to all data copy storage modules; the data body is not written to data arbiter copy module 13.
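The difference between a full copy and the arbiter copy can be sketched with two record types. The class and field names below (`FullCopy`, `ArbiterCopy`, `bodies`) are hypothetical and chosen only for illustration; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass, field


@dataclass
class FullCopy:
    """Full data copy: metadata, data body and operation history."""
    metadata: dict = field(default_factory=dict)
    bodies: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # (seqno, object_id) entries

    def apply(self, seqno, object_id, meta, body):
        self.metadata[object_id] = meta
        self.bodies[object_id] = body
        self.log.append((seqno, object_id))


@dataclass
class ArbiterCopy:
    """Arbiter copy: metadata and operation history only, no data body."""
    metadata: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def apply(self, seqno, object_id, meta, body=None):
        # The data body is deliberately discarded here.
        self.metadata[object_id] = meta
        self.log.append((seqno, object_id))


full = FullCopy()
arbiter = ArbiterCopy()
for copy in (full, arbiter):
    copy.apply(1, "obj-a", {"size": 4}, b"data")
```

After the loop, both copies carry the same metadata and log entry, but only `full` holds the object payload.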
Refer to Fig. 4, which shows the basic steps of data migration and repair. The description so far covers data writing under normal conditions, that is, with no OSD offline and no data migration or repair in progress. In practice, OSDs going offline and the migration and repair of data are indispensable parts of a distributed system. The handling of data migration and repair after a node anomaly is analyzed below, with emphasis on the role of data arbiter copy module 13. The distributed storage method based on data arbiter copy module 13 further comprises the following anomaly repair steps:
S1', determine whether any data copy storage module has gone offline;
S2', if so, the distributed storage system excludes the offline data copy storage module and adjusts the number of data copy storage modules targeted by write operations;
S3', determine whether the offline data copy storage module has come back online;
S4', if so, the data copy storage module obtains the complete operation history information from the other data copy storage modules or from the data arbiter copy module, and performs data repair;
S5', if not, the data copy storage modules that have not gone offline continue the data write operation in a reduced operating mode.
The distributed storage method based on a data arbiter copy provided by the invention uses the PGLog concept of current Ceph. A PGLog is maintained by a PG (placement group) and records all operations on that PG; a PGLog record generally has the form (seqno, object_id). Besides the metadata of the objects, data arbiter copy module 13 also retains this PGLog. When an OSD goes offline, the related PG enters the Degraded state and reduces its own copy count, which ensures that external data can continue to be written. When the offline node comes back online, it obtains the PGLog from the other nodes, merges its own PGLog with the obtained PGLog to produce a complete historical PGLog, and then repairs the corresponding data according to the information in the PGLog.
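Steps S1' to S3' amount to tracking which members of a PG are online and writing only to those. The sketch below uses a hypothetical `PlacementGroup` class; the state label "Normal" is an assumption for illustration, while "Degraded" is the state named in the text.

```python
class PlacementGroup:
    """Hypothetical model of one PG and the copy modules that serve it."""

    def __init__(self, members):
        self.members = list(members)
        self.online = set(members)

    def mark_offline(self, osd):
        self.online.discard(osd)

    def mark_online(self, osd):
        if osd in self.members:
            self.online.add(osd)

    @property
    def state(self):
        # "Normal" is an assumed label; the text only names the Degraded state.
        return "Normal" if len(self.online) == len(self.members) else "Degraded"

    def write_targets(self):
        # Writes go only to modules that are currently online (step S2').
        return [osd for osd in self.members if osd in self.online]


pg = PlacementGroup(["osd_a", "osd_b", "osd_c"])
pg.mark_offline("osd_b")
degraded_targets = pg.write_targets()
degraded_state = pg.state
pg.mark_online("osd_b")
```

While `osd_b` is offline the PG keeps accepting writes on the two remaining members; once it is marked online again, it rejoins the write targets and still has to be repaired from the merged log.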
Refer to Fig. 5 and Fig. 6, which show the detailed steps and the sequence diagram of an offline copy repairing its data. As shown in Fig. 5 and Fig. 6, OSD A, OSD B and OSD C each maintain a PGLog. Suppose OSD C is the data arbiter, so it cannot become the primary OSD. In general the system selects an OSD whose PGLog is complete as the primary OSD; at the start, OSD A is the primary OSD. Step S4' further comprises: S41', OSD B goes offline at moment 1, and the data copy storage module that went offline traverses its own operation history information (PGLog); S42', the PGLog of this data copy storage module is merged with the PGLog in the data arbiter copy module to obtain the complete PGLog. Because the system still has two working OSDs, OSD A and OSD C continue to work and obtain PGLog records 1 and 2. At moment 2, OSD B comes back online, prepares to repair its data, and tries to obtain the current latest PGLog sequence number from the primary OSD (OSD A). Also at moment 2, OSD A goes offline (note: this need not happen at exactly the same instant, only before OSD B obtains the latest sequence number from OSD A). OSD B therefore cannot obtain the latest PGLog sequence number from OSD A, so it tries to obtain the PGLog from the data arbiter (OSD C) and obtains the latest sequence number 3. OSD B becomes the primary OSD. Because of the gap in its PGLog (operations 2 and 3 are missing), the PG is set to the Incomplete state once OSD B is working normally. At moment 3, OSD A comes back online, obtains the latest log from the primary OSD (now OSD B), and merges it; S43', according to the merged operation history information, the data copy storage module that has come back online supplements the missing metadata and data body from the other data copy storage modules. OSD A finds the data operations it lacks (4, 5, 6) and repairs its data by requesting the corresponding operation data from OSD B. Likewise, once OSD B finds that OSD A is online, it can repair the data it lost (2, 3). Basic premise: the system is strongly consistent and the read mode is Read-One, so the data on the primary OSD must be correct.
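The merge in S42' and the gap detection that drives S43' can be sketched as follows. This is a simplified illustration that treats a PGLog as a list of (seqno, object_id) pairs; `merge_logs`, `missing_entries` and the sample object names are hypothetical and do not reflect Ceph's actual PGLog format.

```python
def merge_logs(own, other):
    """Merge two PGLogs into one complete log ordered by sequence number."""
    merged = {seqno: object_id for seqno, object_id in other}
    merged.update({seqno: object_id for seqno, object_id in own})
    return sorted(merged.items())


def missing_entries(own, merged):
    """Entries of the merged log that the rejoining OSD has not applied."""
    have = {seqno for seqno, _ in own}
    return [(seqno, object_id) for seqno, object_id in merged if seqno not in have]


# OSD B went offline after operation 1; the arbiter (OSD C) kept logging.
osd_b_log = [(1, "obj-a")]
arbiter_log = [(1, "obj-a"), (2, "obj-b"), (3, "obj-c")]

complete_log = merge_logs(osd_b_log, arbiter_log)
to_repair = missing_entries(osd_b_log, complete_log)
```

Here `to_repair` lists operations 2 and 3, matching the gap described for OSD B. Since the arbiter holds no data body, the bodies for those objects would still have to be fetched from a full copy.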
Refer to Fig. 7, which shows the modules of the present invention. As shown in Fig. 7, the distributed storage system 1 based on a data arbiter copy is installed on a server platform and comprises a first data copy storage module 11, a second data copy storage module 12 and a data arbiter copy module 13. The first data copy storage module 11 stores the metadata to be read, the data body and the operation history information. The second data copy storage module 12 has the same specification as the first data copy storage module 11 and backs up the metadata, the data body and the operation history information. The data arbiter copy module 13 backs up the metadata and the complete operation history information.
The first data copy storage module 11 comprises a metadata storage module 111, a history information module 112 and a data body storage module 113. The second data copy storage module 12 comprises a metadata backup module 121, a history information backup module 122 and a data body backup module 123. The data arbiter copy module 13 comprises a metadata backup module 131 and a history information backup module 132. The distributed storage system based on the data arbiter copy module can be extended to a distributed-storage multi-active data center.
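Because the arbiter copy module holds no data body, the layout above uses less raw capacity per stored byte than three full copies. The arithmetic can be sketched as below; the function name and the 2% arbiter overhead are assumptions for illustration, not figures from the patent.

```python
def usable_fraction(full_copies: int, arbiter_overhead: float = 0.0) -> float:
    """Fraction of raw capacity available for user data.

    full_copies: number of complete copies (metadata, data body, history).
    arbiter_overhead: assumed size of the arbiter copy relative to one
    full copy (it holds only metadata and operation history).
    """
    if full_copies < 1 or arbiter_overhead < 0:
        raise ValueError("invalid layout")
    return 1.0 / (full_copies + arbiter_overhead)


three_copies = usable_fraction(3)
two_copies_plus_arbiter = usable_fraction(2, arbiter_overhead=0.02)  # 0.02 is an assumed figure
```

Under this assumption the usable fraction rises from one third to just under one half of raw capacity.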
Refer to Fig. 8, which shows the communication of the data center extension embodiment of the present invention. As shown in Fig. 8, the distributed-storage multi-active data center comprises data center module 101, data center module 102 and data arbiter center module 103. With the data arbitration improvement of the invention, Ceph can serve as a novel distributed-storage multi-active data center. Data center 101 and data center 102 are connected by a low-latency (within 5 ms) 10-gigabit network. Data centers 101 and 102 are each connected to data center 103; the latency requirement for these links can be somewhat looser (within 100 ms), and a gigabit network can be used. Data centers 101 and 102 hold the actual data copies and act as dual-active data centers. Data center 103 acts as the data arbiter and assists data recovery only when an anomaly occurs. Such a dual-active data center provides high data availability and also avoids the split-brain problem of conventional data centers.
In summary, for the data read/write flow the present invention adopts the Write-Quorum data object write flow as its basic strategy. The Ceph distributed storage system and method based on a data arbiter copy store the information contained in a data object partly asynchronously: the information contained in the data storage object is first written synchronously to the first data copy storage module and the second data copy storage module, and the complete operation history record is then written asynchronously to the data arbiter copy module as a backup. All read operations on object data information preferentially read the stored data object information and list information from the first copy module.
The client that performs a storage operation first communicates directly with the first data copy storage module: the client sends a write operation request to the server side of the distributed storage system, initiating the data object write operation. After receiving the write operation request, the first data copy storage module first synchronously sends the corresponding backup write operation request to the second data copy storage module, and then asynchronously initiates a data object write operation to the data arbiter copy module. After the first data copy storage module, the second data copy storage module, and the data arbiter copy module have each completed the data object write operation, the second data copy storage module sends the corresponding write-completion acknowledgement to the first data copy storage module. Once the operation history records of the first and second data copy storage modules are complete and all data object information write operations have finished, the server side sends an acknowledgement to the client confirming that the data object write operation is complete.
When an exception such as going offline occurs in one data copy storage module, the data copy storage module without the exception continues to write object data information, and the failed data copy storage module is repaired after it returns to normal operation. This implementation ensures reliability during data object write operations, avoids as far as possible the data loss that could occur when the distributed storage system encounters an exception, reduces the complexity and difficulty of implementing the system, improves data read performance, and solves the problems of the prior-art three-copy strategy: a cumbersome acknowledgement mechanism, high data object write latency, and low overall system IOPS (Input/Output Operations Per Second).
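By way of illustration only, the write flow summarized above may be sketched as follows. This is a minimal Python sketch under stated assumptions, not the disclosed implementation: the class and function names (`DataCopy`, `ArbiterCopy`, `handle_write`) are hypothetical, a thread pool stands in for the asynchronous write to the arbiter, and the client acknowledgement is assumed to be issued once more than half of the data copy modules have acknowledged.

```python
from concurrent.futures import ThreadPoolExecutor


class DataCopy:
    """Data copy module: stores metadata, data volume, and history records."""

    def __init__(self):
        self.metadata, self.volume, self.history = {}, {}, []

    def write(self, name, data):
        self.metadata[name] = {"size": len(data)}
        self.volume[name] = data
        self.history.append(("write", name))
        return True  # write acknowledgement


class ArbiterCopy:
    """Arbiter copy module: stores metadata and history records, no data volume."""

    def __init__(self):
        self.metadata, self.history = {}, []

    def write(self, name, data):
        self.metadata[name] = {"size": len(data)}
        self.history.append(("write", name))
        return True  # asynchronous write acknowledgement


def handle_write(first, second, arbiter, pool, name, data):
    # Synchronous part: the first copy writes, then forwards to the second copy.
    acks = [first.write(name, data), second.write(name, data)]
    # Asynchronous part: the arbiter write is submitted and not waited for here.
    arbiter_ack = pool.submit(arbiter.write, name, data)
    # Assumed quorum rule: more than half of the data copies have acknowledged.
    client_ack = 2 * sum(acks) > len(acks)
    return client_ack, arbiter_ack


with ThreadPoolExecutor(max_workers=1) as pool:
    first, second, arbiter = DataCopy(), DataCopy(), ArbiterCopy()
    client_ack, arbiter_ack = handle_write(first, second, arbiter, pool, "obj", b"abc")
    arbiter_ack.result()  # wait here only so the example finishes deterministically
```

The client acknowledgement does not depend on the arbiter write, so the latency seen by the client is bounded by the two synchronous copies only.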
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall still be covered by the claims of the present invention.

Claims (10)

1. A distributed storage method based on a data arbiter copy, characterized by comprising the following steps:
S1: a distributed storage system receives a data write operation request;
S2: after receiving the write operation request, the distributed storage system starts the data write operation and synchronously writes the data to the data copy storage modules;
S3: when the data write operation of the data copy storage modules is complete, the data copy storage modules directly return a write operation acknowledgement;
S4: the data arbiter copy module continues by asynchronously writing the data of the data copy storage modules, making a complete backup of the data arbitration information to support data migration and repair;
S5: when the write operation of the data arbiter copy module is complete, an asynchronous write operation acknowledgement is returned.
2. The distributed storage method based on a data arbiter copy according to claim 1, characterized in that: the data write operation flow adopts the Write-Quorum data write flow; if the total number of data copy storage modules is n, then once the number of data copy storage modules that have completed the data write operation is greater than n/2, the write operation acknowledgement is returned directly, and the data information for the remaining copy modules is written to those data copy storage modules asynchronously.
3. The distributed storage method based on a data arbiter copy according to claim 1, characterized in that: the data arbitration information is operation history record information, used to repair data that a data copy storage module failed to write because of an exception; the metadata, the data volume, and the operation history record information of a data object are all written to the data copy storage modules; and the data volume is not written to the data arbiter copy module.
4. The distributed storage method based on a data arbiter copy according to claim 1, characterized by further comprising the following steps:
S1': determining whether any data copy storage module has gone offline;
S2': if so, the distributed storage system excludes the offline data copy storage module and adjusts the number of data copy storage modules targeted by write operations;
S3': determining whether the offline data copy storage module has come back online;
S4': if so, that data copy storage module obtains the complete operation history record information and the data volume from the other data copy storage modules or from the data arbiter copy module, and performs data repair;
S5': if not, the data copy storage modules that have not gone offline continue the data write operation.
5. The distributed storage method based on a data arbiter copy according to claim 4, characterized in that step S4' further comprises:
S41': the data copy storage module that went offline traverses its own operation history record information;
S42': the operation history record information in the offline data copy storage module is merged with the operation history record information in the data arbiter copy module to obtain the complete operation history record information;
S43': the data copy storage module that has come back online, according to the merged operation history record information, obtains the missing metadata and data volume from the other data copy storage modules that did not go offline and writes them.
6. A distributed storage system based on a data arbiter copy, for online distributed storage of data, characterized in that the distributed storage system based on a data arbiter copy is installed on a server platform and comprises: a first data copy storage module, a second data copy storage module, and a data arbiter copy module; the first data copy storage module stores the metadata to be read, the data volume, and the operation history record information; the second data copy storage module has the same specification as the first data copy storage module and backs up the metadata, the data volume, and the operation history record information; the data arbiter copy module backs up the metadata and the complete operation history record information.
7. The distributed storage system based on a data arbiter copy according to claim 5, characterized in that: the first copy module comprises a metadata storage module, a history information module, and a data volume storage module; the second copy module comprises a metadata backup module, a history information backup module, and a data volume backup module.
8. The distributed storage system based on a data arbiter copy according to claim 5, characterized in that: the data arbiter copy module comprises a metadata backup module and a history information backup module.
9. The distributed storage system based on a data arbiter copy according to claim 5, characterized in that: the distributed storage system based on a data arbiter copy can be extended to a distributed-storage multi-active data center.
10. The distributed storage system based on a data arbiter copy according to claim 5, characterized in that: the distributed-storage multi-active data center comprises a data center module and a data arbiter center module.
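By way of illustration only, the quorum rule of claim 2 and the offline handling and repair steps of claims 4 and 5 may be sketched as follows. This is a minimal Python sketch under stated assumptions, not the claimed implementation: the function names are hypothetical, each history record is assumed to be a dict with a sequence number `seq` and an object name `name`, and each store is assumed to map an object name to a `(metadata, data)` pair.

```python
def quorum_reached(acks, n):
    """Claim 2: acknowledge once more than n/2 of the n copy modules have written."""
    return 2 * acks > n


def write_targets(modules, offline):
    """S1'/S2': exclude offline modules from the set of write targets."""
    return [m for m in modules if m not in offline]


def merge_history(own, arbiter):
    """S41'/S42': merge both record lists into one, ordered by sequence number."""
    by_seq = {rec["seq"]: rec for rec in arbiter}
    by_seq.update({rec["seq"]: rec for rec in own})
    return [by_seq[seq] for seq in sorted(by_seq)]


def repair(own_history, arbiter_history, healthy_store, local_store):
    """S43': copy objects named by records the returning module never saw."""
    known = {rec["seq"] for rec in own_history}
    complete = merge_history(own_history, arbiter_history)
    for rec in complete:
        if rec["seq"] not in known:
            # Metadata and data volume come from a module that stayed online.
            local_store[rec["name"]] = healthy_store[rec["name"]]
    return complete


# Hypothetical scenario: "copy2" was offline while object "b" was written.
targets = write_targets(["copy1", "copy2"], offline={"copy2"})
own = [{"seq": 1, "name": "a"}]
arb = [{"seq": 1, "name": "a"}, {"seq": 2, "name": "b"}]
healthy = {"a": ({"size": 1}, b"x"), "b": ({"size": 2}, b"yz")}
local = {"a": ({"size": 1}, b"x")}
complete = repair(own, arb, healthy, local)
```

In this sketch the arbiter contributes only history records; the object data itself is always fetched from a data copy module that stayed online, consistent with claim 3.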
CN201510955956.1A 2015-12-17 2015-12-17 Distributed storage system and method based on data arbiter copy Pending CN105577776A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510955956.1A CN105577776A (en) 2015-12-17 2015-12-17 Distributed storage system and method based on data arbiter copy

Publications (1)

Publication Number Publication Date
CN105577776A true CN105577776A (en) 2016-05-11

Family

ID=55887420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510955956.1A Pending CN105577776A (en) 2015-12-17 2015-12-17 Distributed storage system and method based on data arbiter copy

Country Status (1)

Country Link
CN (1) CN105577776A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106598768A (en) * 2016-11-28 2017-04-26 华为技术有限公司 Write request processing method, device and data center
CN106776142A (en) * 2016-12-23 2017-05-31 深圳市深信服电子科技有限公司 A kind of date storage method and data storage device
CN106951559A (en) * 2017-03-31 2017-07-14 联想(北京)有限公司 Data reconstruction method and electronic equipment in distributed file system
CN107403003A (en) * 2017-07-21 2017-11-28 南京智网云联信息科技有限公司 A kind of distributed copies file referee method
CN107526652A (en) * 2016-06-21 2017-12-29 华为技术有限公司 A kind of method of data synchronization and storage device
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN108063787A (en) * 2017-06-26 2018-05-22 杭州沃趣科技股份有限公司 The method that dual-active framework is realized based on distributed consensus state machine
WO2018108158A1 (en) * 2016-12-16 2018-06-21 贵州白山云科技有限公司 Method and device for storing data based on majority, and storage medium and apparatus
CN108572793A (en) * 2017-10-18 2018-09-25 北京金山云网络技术有限公司 Data are written and data reconstruction method, device, electronic equipment and storage medium
CN108958984A (en) * 2018-08-13 2018-12-07 深圳市证通电子股份有限公司 Dual-active based on CEPH synchronizes online hot spare method
CN109992219A (en) * 2019-04-11 2019-07-09 深信服科技股份有限公司 Distributed storage method, device, equipment and computer readable storage medium
CN107229535B (en) * 2017-05-23 2020-01-21 杭州宏杉科技股份有限公司 Multi-copy storage method, storage device and data reading method for data block
CN111290699A (en) * 2018-12-07 2020-06-16 杭州海康威视系统技术有限公司 Data migration method, device and system
CN111831674A (en) * 2020-06-29 2020-10-27 山大地纬软件股份有限公司 Block chain node, system and digital data copy distribution method
CN112039436A (en) * 2020-09-03 2020-12-04 成都易联智通信息技术有限公司 Method for analyzing power station state by integrating photovoltaic inverter working state and real-time data
CN112468601A (en) * 2021-02-03 2021-03-09 柏科数据技术(深圳)股份有限公司 Data synchronization method, access method and system of distributed storage system
CN114625325A (en) * 2022-05-16 2022-06-14 阿里云计算有限公司 Distributed storage system and storage node offline processing method thereof
CN116048424A (en) * 2023-03-07 2023-05-02 浪潮电子信息产业股份有限公司 IO data processing method, device, equipment and medium
CN116149558A (en) * 2023-02-21 2023-05-23 北京志凌海纳科技有限公司 Copy allocation strategy system and method in distributed storage dual-active mode

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102882927A (en) * 2012-08-29 2013-01-16 华南理工大学 Cloud storage data synchronizing framework and implementing method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
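Zhai Yongdong: "Research and Optimization of the Reliability of the Hadoop Distributed File System (HDFS)", China Master's Theses Full-text Database, Information Science and Technology *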

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107526652B (en) * 2016-06-21 2021-08-20 华为技术有限公司 Data synchronization method and storage device
CN107526652A (en) * 2016-06-21 2017-12-29 华为技术有限公司 A kind of method of data synchronization and storage device
CN106598768A (en) * 2016-11-28 2017-04-26 华为技术有限公司 Write request processing method, device and data center
CN106598768B (en) * 2016-11-28 2020-02-14 华为技术有限公司 Method and device for processing write request and data center
CN108206839B (en) * 2016-12-16 2020-02-07 贵州白山云科技股份有限公司 Data storage method, device and system based on majority
WO2018108158A1 (en) * 2016-12-16 2018-06-21 贵州白山云科技有限公司 Method and device for storing data based on majority, and storage medium and apparatus
CN108206839A (en) * 2016-12-16 2018-06-26 贵州白山云科技有限公司 One kind is based on majority's date storage method, apparatus and system
CN106776142A (en) * 2016-12-23 2017-05-31 深圳市深信服电子科技有限公司 A kind of date storage method and data storage device
CN106776142B (en) * 2016-12-23 2020-09-01 深信服科技股份有限公司 Data storage method and data storage device
CN106951559B (en) * 2017-03-31 2020-08-25 联想(北京)有限公司 Data recovery method in distributed file system and electronic equipment
CN106951559A (en) * 2017-03-31 2017-07-14 联想(北京)有限公司 Data reconstruction method and electronic equipment in distributed file system
CN107229535B (en) * 2017-05-23 2020-01-21 杭州宏杉科技股份有限公司 Multi-copy storage method, storage device and data reading method for data block
CN108063787A (en) * 2017-06-26 2018-05-22 杭州沃趣科技股份有限公司 The method that dual-active framework is realized based on distributed consensus state machine
CN107403003A (en) * 2017-07-21 2017-11-28 南京智网云联信息科技有限公司 A kind of distributed copies file referee method
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN107870982B (en) * 2017-10-02 2021-04-23 深圳前海微众银行股份有限公司 Data processing method, system and computer readable storage medium
CN108572793A (en) * 2017-10-18 2018-09-25 北京金山云网络技术有限公司 Data are written and data reconstruction method, device, electronic equipment and storage medium
CN108572793B (en) * 2017-10-18 2021-09-10 北京金山云网络技术有限公司 Data writing and data recovery method and device, electronic equipment and storage medium
CN108958984A (en) * 2018-08-13 2018-12-07 深圳市证通电子股份有限公司 Dual-active based on CEPH synchronizes online hot spare method
CN111290699A (en) * 2018-12-07 2020-06-16 杭州海康威视系统技术有限公司 Data migration method, device and system
CN111290699B (en) * 2018-12-07 2023-03-14 杭州海康威视系统技术有限公司 Data migration method, device and system
CN109992219A (en) * 2019-04-11 2019-07-09 深信服科技股份有限公司 Distributed storage method, device, equipment and computer readable storage medium
CN111831674A (en) * 2020-06-29 2020-10-27 山大地纬软件股份有限公司 Block chain node, system and digital data copy distribution method
CN112039436A (en) * 2020-09-03 2020-12-04 成都易联智通信息技术有限公司 Method for analyzing power station state by integrating photovoltaic inverter working state and real-time data
CN112039436B (en) * 2020-09-03 2023-11-17 苏州奥维斯数字技术有限公司 Method for analyzing power station state by integrating working state and real-time data of photovoltaic inverter
CN112468601A (en) * 2021-02-03 2021-03-09 柏科数据技术(深圳)股份有限公司 Data synchronization method, access method and system of distributed storage system
CN112468601B (en) * 2021-02-03 2021-05-18 柏科数据技术(深圳)股份有限公司 Data synchronization method, access method and system of distributed storage system
CN114625325A (en) * 2022-05-16 2022-06-14 阿里云计算有限公司 Distributed storage system and storage node offline processing method thereof
CN116149558A (en) * 2023-02-21 2023-05-23 北京志凌海纳科技有限公司 Copy allocation strategy system and method in distributed storage dual-active mode
CN116149558B (en) * 2023-02-21 2023-10-27 北京志凌海纳科技有限公司 Copy allocation strategy system and method in distributed storage dual-active mode
CN116048424A (en) * 2023-03-07 2023-05-02 浪潮电子信息产业股份有限公司 IO data processing method, device, equipment and medium
CN116048424B (en) * 2023-03-07 2023-06-06 浪潮电子信息产业股份有限公司 IO data processing method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN105577776A (en) Distributed storage system and method based on data arbiter copy
CN101051286B (en) Storage system
CN102891849B (en) Service data synchronization method, data recovery method, data recovery device and network device
EP2120147B1 (en) Data mirroring system using journal data
CN103593266B (en) A kind of double hot standby method based on arbitration disk mechanism
CA2376242C (en) Improved remote data copy using a prospective suspend command
US20070276884A1 (en) Method and apparatus for managing backup data and journal
US20010049749A1 (en) Method and system for storing duplicate data
CN106201338A (en) Date storage method and device
EP0723223A2 (en) Identifying controller pairs in a dual controller disk array
CN102710752B (en) Calamity is for storage system
CN1983153A (en) Method for carrying long-distance copy in data processing system and method of storing data
US6073221A (en) Synchronization of shared data stores through use of non-empty track copy procedure
CN105049258B (en) The data transmission method of network disaster tolerance system
CN103544057A (en) Switching method and switching system for data service systems
CN110058787A (en) For the method, equipment and computer program product of data to be written
JP2001134487A (en) Disk array device
CN106951456B (en) Memory database system and data processing system
CN103092778A (en) Cache mirroring method for memory system
CN103544081B (en) The management method of double base data server and device
US8527723B1 (en) Storage system and control method for storage system
CN105574026A (en) Method and device for service supporting by using non-relational database
CN104484354B (en) Ensure the Snapshot Method and storage device of data consistency
CN106980556A (en) A kind of method and device of data backup
CN103761156B (en) A kind of online restorative procedure for file system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20200103
