WO2015186191A1 - Data management system and data management method - Google Patents
Data management system and data management method
- Publication number
- WO2015186191A1 (PCT/JP2014/064739, JP2014064739W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- determination unit
- unit
- consistency
- determination
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/186—Passive fault masking when reading multiple copies of the same data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/187—Voting techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2048—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share neither address space nor persistent storage
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
Definitions
- The present invention relates to a computer system that multiplexes data by storing it in a plurality of servers to ensure availability.
- In a multiplexing method in which data is stored in a plurality of servers, a technique using a distributed agreement algorithm is known for ensuring data consistency (for example, Patent Document 1).
- The Paxos (PAXOS) algorithm is a known example of a distributed agreement algorithm.
- The original data is stored as a master in a master computer.
- The duplicated data is treated as a slave and stored in a plurality of slave computers.
- The number of processes n is represented as n = 2f + 1. From the above, at least two communications are required between the computers (master and slave), and the allowable failure number e is less than n / 2.
- The allowable failure number e is the number of processes (or computers) that may fail while the minimum number of communications (latency) is still maintained.
- The latency is the minimum number of communications from when the client requests a data update (or reference) from the master computer until the slave computers reach an agreement (that is, until data consistency is guaranteed).
- Techniques for securing latency while ensuring consistency have also been proposed (for example, Non-Patent Documents 1 and 2).
- The present invention has been made in view of the above-described problems. Its purpose is to reduce the number of communications from when a client requests a server to update (or reference) data until data consistency is guaranteed, while suppressing an increase in the number of servers (or the number of processes).
- The present invention is a data management system having a plurality of servers, each including a processor, a memory, and a storage device, in which the plurality of servers receive and store data and hold the data in multiplexed form.
- The system comprises a first determination unit that determines the consistency of the multiplexed data.
- It also comprises a second determination unit that, when determining the consistency of the multiplexed data, tolerates a larger number of server faults than the first determination unit but requires a larger minimum number of communications between the servers.
- A combination unit combines the consistency determination results of the first determination unit and the second determination unit and outputs the data whose consistency is guaranteed.
- A data storage unit stores the data output by the combination unit.
- According to the present invention, an increase in the number of servers (number of processes) can be suppressed, and a decrease in the number of allowable server faults can be suppressed.
- FIG. 1 is a block diagram illustrating an example of a computer system that performs distributed data management according to a first embodiment of this invention.
- FIG. 2 is a block diagram showing an example of a server according to the first embodiment of this invention.
- FIG. 3 is a sequence diagram showing an example of the distributed data management performed by the servers according to the first embodiment.
- FIG. 4 is a diagram showing the priority of algorithm selection according to the first embodiment.
- FIG. 5 is a diagram comparing the performance of each algorithm according to the first embodiment.
- FIG. 6 is a flowchart showing an example of the processing performed by each server according to the first embodiment.
- FIG. 7 is a flowchart showing an example of the process-saving one-step agreement process performed in step S4 of FIG. 6 according to the first embodiment.
- FIG. 8 is a flowchart showing an example of the two-step agreement process performed in step S5 of FIG. 6 according to the first embodiment.
- The remaining drawings are a flowchart showing an example of the combination process performed in step S6 of FIG. 6 (first embodiment), a flowchart showing an example of the combination process by total order (second embodiment), and a block diagram showing an example of the transmission/reception unit of each server by partial order (third embodiment).
- FIG. 1 is a block diagram showing an example of a computer system that performs distributed data management.
- Servers 1-1 to 1-n are connected to clients 3-1, 3-2 via network 2.
- The servers 1-1 to 1-n constitute a distributed data management system that distributes and stores data received from the clients 3-1 and 3-2.
- The servers 1-1 to 1-n are collectively denoted by reference numeral 1.
- The clients 3-1 and 3-2 are collectively denoted by reference numeral 3.
- FIG. 2 is a block diagram showing an example of the configuration of the server 1-1. Since the servers 1-2 to 1-n have the same configuration, a duplicate description is omitted.
- The server 1-1 is a computer that includes a processor 11 that performs operations, a memory 12 that stores programs and data, a storage device 14 that stores data and programs, and an interface 13 that is connected to the network 2 and performs communication.
- The memory 12 stores a transmission/reception unit 110 that transmits and receives data via the interface 13, an update unit 130 that determines the identity (consistency) of the received data and the data of another server 1, and a data storage unit 140 that stores the data output by the update unit 130.
- The data storage unit 140 may be set in the storage device 14 or may be set in both the storage device 14 and the memory 12.
- The update unit 130 includes a low-latency agreement algorithm execution unit 200, a PAXOS agreement unit 230, and a combination unit 240.
- The low-latency agreement algorithm execution unit 200 includes a process-saving one-step agreement unit 210, which executes a process-saving one-step agreement algorithm to determine the identity of the received data and the data of another server 1, and a two-step agreement unit 220, which executes a two-step agreement algorithm.
- The PAXOS agreement unit 230 executes the PAXOS algorithm as an auxiliary agreement algorithm. The combination unit 240 combines the outputs of the process-saving one-step agreement unit 210 and the two-step agreement unit 220 and outputs a definite value for which consistency is guaranteed.
- the functional units of the transmission / reception unit 110, the process-saving 1-step agreement unit 210, the 2-step agreement unit 220, the PAXOS agreement unit 230, and the combination unit 240 constituting the update unit 130 are loaded into the memory 12 as programs.
- the processor 11 operates as a functional unit that provides a predetermined function by processing according to a program of each functional unit.
- the processor 11 functions as the transmission / reception unit 110 by processing according to the transmission / reception program, and functions as the combination unit 240 by processing according to the combination program.
- the processor 11 also operates as a functional unit that provides each function of a plurality of processes executed by each program.
- a computer and a computer system are an apparatus and a system including these functional units.
- Information such as programs and tables for realizing each function of the server 1-1 can be stored in the storage device 14, in a storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.
- The transmission/reception unit 110 transmits the data received by the server 1-1 to the other servers 1-2 to 1-n. Thereafter, the transmission/reception unit 110 receives the update request data transmitted from half of all the servers 1-2 to 1-n and passes the received data to the update unit 130, which determines the identity of the data.
- the update unit 130 outputs data that is guaranteed to be consistent with the data of other servers 1 and writes the data in the data storage unit 140.
- the update unit 130 transmits an update completion response to the client 3 and completes the update process.
- the configuration of the client 3 is a computer including a processor, a memory, an interface, and a storage device (not shown).
- the client 3 executes an application that requests the server 1 to register, update, and reference data.
- the configuration of the update unit 130 will be described. As described above, the data received by the transmission / reception unit 110 from the client 3 and the data of the client 3 received by another server 1 are input to the update unit 130.
- The update unit 130 inputs these data to the process-saving one-step agreement unit 210 (first determination unit) and the two-step agreement unit 220 (second determination unit), each of which determines the identity (or consistency) of the data.
- the process-saving one-step agreement unit 210 outputs a definite value, an estimated value, or a resolved value as an identity determination result.
- the two-step agreement unit 220 outputs a definite value, an estimated value, or a resolved value as the identity determination result.
- The combination unit 240 receives these definite values, estimated values, or solution values and outputs a definite value (agreed value) that guarantees consistency for the input data.
- The definite value is data for which identity (or consistency) is guaranteed. The agreed value is data whose identity has been agreed with the other servers 1, so that consistency is guaranteed between the servers 1.
- When no definite value is obtained, the combination unit 240 inputs the estimated value or the data received from the client 3 (the solution value) to the PAXOS agreement unit 230 (third determination unit or auxiliary agreement unit), which communicates with the PAXOS agreement units 230 of the other servers 1 to calculate a definite value.
- The update unit 130 acquires the definite value of the PAXOS agreement unit 230 and outputs it as the agreed value.
- the update unit 130 stores the agreement value output from the combination unit 240 or the PAXOS agreement unit 230 in the data storage unit 140, and transmits a response indicating completion of the data update to the client 3-1.
- FIG. 15 is a block diagram illustrating an example of the process-saving one-step agreement unit 210.
- The determination of data identity (consistency) by the process-saving one-step agreement algorithm, and the resolution method used when identity is not guaranteed (hereinafter referred to as a collision), are the same as in the one-step agreement algorithm of Non-Patent Document 1 and are as follows.
- Identity of data: definite values always match.
- Collision resolution: if there is a possibility that a definite value exists in any one of the servers 1, the estimated value always matches that definite value.
- In FIG. 15, the transmission/reception units 110 of the servers 1-1 to 1-5 in FIG. 1 are shown as transmission/reception units 1 to 5, and the update units 130 of the servers 1-1 and 1-5 are shown as update units 1 and 5.
- the transmission / reception units 1 to 4 receive data “A” from the client 3
- the transmission / reception unit 5 receives data “B” from the client 3.
- The process-saving one-step agreement unit 210, which executes the process-saving one-step agreement algorithm, divides the transmission/reception units 1 to 5 into two kinds of quorums, a definite quorum and an estimated quorum. The example shows the update unit 1 and the update unit 5 each obtaining an agreement. Note that a quorum is a subset of the elements that execute distributed processing.
- Let Qe be the number of transmission/reception units 110 (servers 1) constituting the definite quorum.
- Let Qf be the number of transmission/reception units 110 (servers 1) constituting the estimated quorum.
- Let n be the total number of transmission/reception units 110 (servers 1).
- Any definite quorum needs to include a majority of any estimated quorum, so Qe + (Qf / 2) > n.
- The quorum sizes that satisfy this are the smallest integer Qe such that Qe ≥ (3/4) n and the smallest integer Qf such that Qf > n / 2.
- Here, a majority is the smallest integer greater than n / 2.
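The quorum sizes can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the definite quorum condition is read as "Qe is at least (3/4) n"; the function names `quorum_sizes` and `satisfies_overlap` are illustrative and do not appear in the patent.

```python
def quorum_sizes(n: int) -> tuple[int, int]:
    """Return (Qe, Qf) for n servers, under the assumed reading of the text."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    qe = (3 * n + 3) // 4  # smallest integer with Qe >= (3/4) n (integer ceiling)
    qf = n // 2 + 1        # smallest integer with Qf > n / 2
    return qe, qf


def satisfies_overlap(n: int) -> bool:
    """Check Qe + Qf / 2 > n without floating point (both sides doubled)."""
    qe, qf = quorum_sizes(n)
    return 2 * qe + qf > 2 * n
```

For the five-server example used in the figures, `quorum_sizes(5)` gives a definite quorum of 4 and an estimated quorum of 3.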
- The definite value and the estimated value are determined as follows.
  - If all data values match within the definite quorum, that value is output as the definite value; otherwise, empty is output as the definite value.
  - If it cannot be determined that a definite quorum exists, nothing is output as the definite value.
  - If a majority or more of the data values match within the estimated quorum, that value is output as the estimated value; otherwise, empty is output as the estimated value.
- The solution value when a collision occurs is determined in the same way as in the one-step agreement algorithm, as follows. If the estimated value is not empty, the estimated value is output as the solution value; if it is empty, any one of the input data is used as the solution value. Identity can then be ensured by executing a PAXOS agreement with the other update units 130 on the solution value and the estimated value.
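The determination rules above can be sketched as three small functions. This is an illustrative sketch only: it assumes data values are strings, represents "empty" as `None`, and picks the first input when any input may be chosen; none of these names come from the patent.

```python
from collections import Counter
from typing import Optional, Sequence

EMPTY = None  # assumption: "empty" is modeled as None


def definite_value(definite_quorum: Sequence[str]) -> Optional[str]:
    """Value shared by every member of the definite quorum, else empty."""
    if definite_quorum and len(set(definite_quorum)) == 1:
        return definite_quorum[0]
    return EMPTY


def estimated_value(estimated_quorum: Sequence[str]) -> Optional[str]:
    """Value held by a majority of the estimated quorum, else empty."""
    if not estimated_quorum:
        return EMPTY
    value, count = Counter(estimated_quorum).most_common(1)[0]
    majority = len(estimated_quorum) // 2 + 1
    return value if count >= majority else EMPTY


def solution_value(estimate: Optional[str], inputs: Sequence[str]) -> str:
    """Estimated value if present, otherwise an arbitrary input (here the first)."""
    return estimate if estimate is not EMPTY else inputs[0]
```

With four servers holding "A" and one holding "B", a definite quorum of the four "A" servers yields the definite value "A", while a definite quorum that includes the "B" server yields empty and falls back to the estimated value.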
- The quorum process including the transmission/reception unit 1 is executed by the update unit 1 of the server 1-1, but it may instead be executed by the update unit n of another server 1-n.
- FIG. 16 is a block diagram illustrating an example of the two-step agreement unit 220.
- The data identity determination by the two-step agreement algorithm and the collision resolution method used when identity is not guaranteed are the same as in the above-described process-saving one-step agreement algorithm (and the one-step agreement algorithm).
- In FIG. 16, the transmission/reception units 110 of the servers 1-1 to 1-5 in FIG. 1 are shown as transmission/reception units 1 to 5, and the update units 130 of the servers 1-1 and 1-5 are shown as update units 1 and 5.
- The two-step agreement unit 220 shown in FIG. 1 is divided into a front-stage unit 220-A and a rear-stage unit 220-B. The front-stage units 220-A of the servers 1-1 to 1-5 are shown as front-stage units 1 to 5, and the rear-stage units 220-B of the servers 1-1 and 1-5 are shown as rear-stage units 1 and 5.
- FIG. 16 shows an example in which the consensus value is determined by the rear stage unit 1 of the server 1-1 and the rear stage unit 5 of the server 1-5.
- the transmission / reception units 1 to 4 receive data “A” from the client 3, and the transmission / reception unit 5 receives data “B” from the client 3.
- The figure shows an example in which the transmission/reception units 1 to 5 are divided into a plurality of selected quorums, the front-stage units 1 to 5 are divided into two counting quorums, and the update units 1 and 5 perform the processing. Although illustration is omitted, the input of the front-stage unit 2 is the data received by the selected quorum of the transmission/reception units 2, 3, and 4; the input of the front-stage unit 3 is the data received by the transmission/reception units 1, 3, and 4; and the input of the front-stage unit 4 is the data received by the transmission/reception units 1, 2, and 4.
- the selected quorum of the transmission / reception units 1 to 5 and the counting quorum of the front stage units 1 to 5 are set as follows.
- any selected quorums must overlap each other, and more than half are required. Therefore, the size of the selected quorum is the smallest integer exceeding n / 2.
- the size of the counting quorum is the smallest integer exceeding n / 2.
- For each selected quorum, the front-stage units 1 to 5 of the two-step agreement unit 220 transmit a selection value to the rear-stage units. If all the data received from the selected quorum match, that data is the selection value; otherwise, the selection value is empty.
- The definite value and the estimated value in the rear-stage units 1 and 5 are determined as follows.
  - If all the data match within the counting quorum, that value is output as the definite value.
  - If there is non-empty data in the counting quorum, that value is used as the estimated value; otherwise, empty is output as the estimated value.
- The solution value is determined in the same way as in the one-step agreement. If the estimated value is not empty, the solution value is the estimated value; if it is empty, the solution value is an arbitrary one of the input data. Identity can then be ensured by executing a PAXOS agreement with the other update units 130 on the solution value and the estimated value.
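The two stages can be sketched as a pair of functions, one for the front-stage unit and one for the rear-stage unit. This is a hedged illustration: "empty" is modeled as `None`, a counting quorum whose members are all empty is treated as having no definite value (an assumption, since the text does not cover that case), and the function names are hypothetical.

```python
from typing import Optional, Sequence


def select_value(selected_quorum: Sequence[str]) -> Optional[str]:
    """Front stage: forward the value only if the whole selected quorum agrees."""
    if selected_quorum and len(set(selected_quorum)) == 1:
        return selected_quorum[0]
    return None  # empty selection value


def count_values(
    counting_quorum: Sequence[Optional[str]],
) -> tuple[Optional[str], Optional[str]]:
    """Rear stage: return (definite value, estimated value) for one counting quorum."""
    non_empty = [v for v in counting_quorum if v is not None]
    all_match = (
        bool(non_empty)
        and len(non_empty) == len(counting_quorum)
        and len(set(non_empty)) == 1
    )
    definite = non_empty[0] if all_match else None
    estimated = non_empty[0] if non_empty else None
    return definite, estimated
```

In the five-server example, the front-stage unit 2 sees "A" from transmission/reception units 2, 3, and 4 and forwards "A", whereas a selected quorum containing unit 5 (which received "B") forwards an empty selection value.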
- FIG. 3 is a sequence diagram showing an example of distributed data management performed by the server 1 of the present invention.
- the client 3-1 transmits an update request for data A to the servers 1-1 to 1-n by multicast.
- the multicast transmission from the client 3-1 to each server 1 may be performed by a management computer (not shown).
- the transmission / reception unit 110 of each server 1 transmits the data received from the client 3-1 to the other server 1 by multicast (M2 in the figure).
- the server 1 transmits data received from half of all the servers 1 to the update unit 130, and performs identity determination.
- The update unit 130 inputs the data of the definite quorum (or estimated quorum) shown in FIG. 15 to the process-saving one-step agreement unit 210 (SP1-STEP in FIG. 3). The update unit 130 also inputs the data of the selected quorum (or counting quorum) to the two-step agreement unit 220.
- When all the data input from 3/4 of the servers match (that is, when all the data input from the definite quorum match), the process-saving one-step agreement unit 210 determines that data as the definite value. When a majority of the data input from half of the servers match (that is, when a majority of the data input from the estimated quorum match), the process-saving one-step agreement unit 210 takes that data as the estimated value. In all other cases, an arbitrary input value, such as the data received from the client 3, is output as the solution value.
- To calculate a definite value, an estimated value, or a solution value, the two-step agreement unit 220 has the front-stage unit 220-A (2-STEP (1) in FIG. 3) select a value from the data of the selected quorum and transmit it to the rear-stage unit 220-B (2-STEP (2) in FIG. 3) of each counting quorum (M3).
- When all the data input from the counting quorum match, the rear-stage unit 220-B outputs that data as the definite value. When the data input from the counting quorum partially match, the rear-stage unit 220-B uses that data as the estimated value. In all other cases, an arbitrary input value, such as the data received from the client 3, is output as the solution value.
- FIG. 4 is a diagram showing a priority order for selecting an agreement algorithm.
- One entry is composed of the priority 2401 and the feature 2402 describing the contents to be selected. This priority order is held in the condition setting unit 120.
- As the first priority, the combination unit 240 selects either the definite value of the process-saving one-step agreement unit 210 or the definite value of the two-step agreement unit 220 as the agreed value (agreed definite value).
- The definite value of the process-saving one-step agreement unit 210 is the value on which the data received from 3/4 of the servers 1 (the definite quorum) all match.
- The definite value of the two-step agreement unit 220 is the value on which the data received by the rear-stage unit 220-B from 1/2 of the servers 1 (the counting quorum) all match.
- As the second priority, the combination unit 240 selects the estimated value of the two-step agreement unit 220.
- This estimated value is a value on which part of the data received by the rear-stage unit 220-B of the two-step agreement unit 220 from 1/2 of the servers 1 (the counting quorum) match.
- As the third priority, the combination unit 240 selects the estimated value of the process-saving one-step agreement unit 210.
- This estimated value is the value on which a majority of the data received by the process-saving one-step agreement unit 210 from 1/2 of the servers 1 (the estimated quorum) match.
- As the fourth priority, the combination unit 240 selects one of the solution values of the process-saving one-step agreement unit 210 or the two-step agreement unit 220.
- The combination unit 240 selects the output of the process-saving one-step agreement unit 210 or the two-step agreement unit 220 in accordance with the priority order shown in FIG. 4 and, if the selected data is a definite value, outputs it as it is.
- When the combination unit 240 selects an estimated value or a solution value, an agreement must still be reached with the other servers 1. For this reason, the combination unit 240 inputs the selected estimated value or solution value to the PAXOS agreement unit 230, which determines the agreed value of the data with the other servers 1.
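The four-level priority order can be written as a simple cascade. The sketch below is an illustration under stated assumptions: the `Outcome` record, the `combine` function, and the string labels are hypothetical names, and "empty" is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """Output of one agreement unit; None stands for an empty value."""
    definite: Optional[str] = None
    estimated: Optional[str] = None
    solution: Optional[str] = None


def combine(one_step: Outcome, two_step: Outcome) -> tuple[str, str]:
    """Pick a value by the priority order described for the combination unit."""
    # Priority 1: a definite value from either unit.
    for outcome in (one_step, two_step):
        if outcome.definite is not None:
            return ("definite", outcome.definite)
    # Priority 2: the estimated value of the two-step unit.
    if two_step.estimated is not None:
        return ("estimated", two_step.estimated)
    # Priority 3: the estimated value of the process-saving one-step unit.
    if one_step.estimated is not None:
        return ("estimated", one_step.estimated)
    # Priority 4: a solution value from either unit.
    for outcome in (one_step, two_step):
        if outcome.solution is not None:
            return ("solution", outcome.solution)
    raise ValueError("neither unit produced any value")


def needs_paxos(kind: str) -> bool:
    """Only a definite value can be output without the auxiliary agreement."""
    return kind != "definite"
```

Returning the kind alongside the value keeps the later decision (output directly, or hand over to the PAXOS agreement unit) separate from the selection itself.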
- The update unit 130 stores the agreed value output from the combination unit 240 or the agreed value output from the PAXOS agreement unit 230 in the data storage unit 140 as data whose identity with each server 1 is guaranteed. The update unit 130 then responds to the client 3-1 that data storage is completed.
- In this way, the two agreement algorithms of the process-saving one-step agreement unit 210 and the two-step agreement unit 220 are combined, and when a definite value cannot be obtained from these two agreement algorithms, the PAXOS agreement unit 230 obtains the agreed value.
- As a result, the minimum number of communications from when the client 3 requests the server 1 to update (or reference) data until agreement is reached at the server 1 can be reduced.
- FIG. 5 is a diagram for comparing the performance of each consensus algorithm.
- One entry is composed of the name 3001 of the agreement algorithm; the number of processes n 3002, indicating the number of computers storing data that is necessary to guarantee data consistency; the allowable failure number e 3003, at which the minimum number of communications can still be maintained; and the minimum number of communications 3004 from the client 3 until an agreement is obtained at the server 1.
- For the required number of processes 3002, f denotes the number of data copies for ensuring data consistency, and n, the number of processes, denotes the number of computers storing the data.
- For the allowable failure number e 3003 at which the minimum number of communications can be maintained, PAXOS and the two-step algorithm have the largest value, e < n / 2, and therefore the highest availability.
- The process-saving one-step algorithm has the smallest value, with e bounded by n / 4, and therefore the lowest availability.
- The allowable failure number e 3003 of the one-step algorithm is larger than that of the process-saving one-step algorithm but smaller than that of PAXOS and the two-step algorithm.
- The first communication is the data update request from the client 3-1, and the second is the transmission of the received data from the transmission/reception unit 110 of each server 1 to the other servers 1 (M2).
- The agreed value between the servers 1 can thus be obtained with a total of two communications.
- The required number of processes n 3002 is smaller than that of the one-step algorithm.
- It is n = 2f + 1, equivalent to PAXOS.
- The allowable failure number e is e < n / 2, which is equivalent to PAXOS.
- The minimum number of communications is two, which is smaller than PAXOS and equivalent to the one-step algorithm.
- As a result, the allowable failure number e equivalent to PAXOS can be maintained while reducing the computer resources (number of processes n) compared with the one-step algorithm, and the minimum number of communications equivalent to the one-step algorithm can be secured.
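The relationship between quorum size and allowable failures can be illustrated numerically. The sketch below rests on an assumption not spelled out in the text: that the fast path of each algorithm survives as long as one full quorum still responds, so the allowable failure number is n minus the quorum size. The function name and dictionary keys are hypothetical.

```python
def allowable_failures(n: int) -> dict[str, int]:
    """Failures tolerated while a full quorum can still respond (assumed model)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    definite_quorum = (3 * n + 3) // 4  # smallest integer >= (3/4) n
    majority_quorum = n // 2 + 1        # smallest integer > n / 2
    return {
        "process_saving_one_step": n - definite_quorum,
        "two_step": n - majority_quorum,
    }
```

Under this model, five servers tolerate one failure on the process-saving one-step path and two on the two-step path, which agrees with the ordering of availability described above.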
- FIG. 6 is a flowchart illustrating an example of processing performed in each server. This process is executed when a data update request (or reference request or registration request) is received from the client 3.
- The server 1 receives the data included in the update request transmitted by multicast from the client 3 (S1).
- The server 1 transmits the received data to the other servers 1 by multicast (S2).
- The server 1 receives the data transmitted by the client 3 from half of all the servers 1 (S3).
- The server 1 inputs the data received in step S3 from half of all the servers 1 to both the process-saving one-step agreement process in step S4 and the two-step agreement process in step S5.
- In step S4, the processing of the process-saving one-step agreement unit 210 described above is executed as shown in FIG. 7.
- In step S5, the processing of the two-step agreement unit 220 described above is executed as shown in FIG. 8.
- Although the illustrated example performs the process-saving one-step agreement process (S4) and the two-step agreement process (S5) in parallel, they may be performed sequentially.
- In step S6, the processing of the combination unit 240 described above is executed.
- In step S7, the update unit 130 of the server 1 determines whether an agreement has been confirmed, that is, whether a definite value has been output from either the process-saving one-step agreement process or the two-step agreement process. If the agreement is confirmed, the process proceeds to step S8, and the definite value output by the combination unit 240 is determined as the agreed value.
- If the agreement is not confirmed in step S7, the update unit 130 proceeds to step S9, inputs the estimated value or the solution value output from the combination unit 240 to the PAXOS agreement unit 230, and executes the agreement process.
- In step S10, the update unit 130 receives the output from the PAXOS agreement unit 230 and determines it as the agreed value.
- The update unit 130 stores the agreed value determined in step S8 or step S10 in the data storage unit 140 and responds to the client 3 that the update is completed.
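The tail of this flow, steps S7 to S10 followed by storage and the response, can be sketched with the external pieces passed in as callables. Everything here is illustrative: `finish_update` and its parameters are hypothetical names, and the PAXOS agreement, the data store, and the client response are stand-ins supplied by the caller rather than real implementations.

```python
from typing import Callable


def finish_update(
    kind: str,
    value: str,
    paxos_agree: Callable[[str], str],
    store: Callable[[str], None],
    respond: Callable[[str], None],
) -> str:
    """Decide the agreed value from the combination output, then store and reply."""
    if kind == "definite":
        agreed = value               # S7 -> S8: agreement already confirmed
    else:
        agreed = paxos_agree(value)  # S9, S10: fall back to the auxiliary agreement
    store(agreed)                    # write to the data storage unit
    respond("update complete")       # reply to the client
    return agreed
```

Passing the agreement and storage steps as callables keeps the sketch self-contained and makes the two branches easy to exercise separately.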
- FIG. 7 is a flowchart showing an example of the process-saving one-step agreement process performed in step S4 of FIG. 6. This process is executed by the process-saving one-step agreement unit 210 of the update unit 130.
- First, the update unit 130 waits until data has been received from 1/2 of all servers 1 (the estimated quorum) (S11). Once data has been received from the 1/2 of all servers 1 that constitute the estimated quorum, the update unit 130 determines whether all of these data match (S12). If they all match, the process proceeds to step S13; if not, it proceeds to step S16.
- In step S13, the update unit 130 waits until data has been received from 3/4 of all servers 1 (the confirmed quorum).
- The update unit 130 then determines whether all of these data match (S14). If they all match, the process proceeds to step S15; if not, it proceeds to step S17.
- In step S15, since all data of the estimated quorum and all data of the confirmed quorum match, the update unit 130 determines the received data to be a confirmed value.
- In step S16, which is reached when the data did not all match in step S12, the update unit 130 determines whether a majority of the data received by the server 1 match. If a majority match, the process proceeds to step S17; if not, it proceeds to step S18.
- In step S17, the update unit 130 determines the data on which a majority agree to be an estimated value. If, on the other hand, no majority of the data in the estimated quorum match, the update unit 130 sets the data to a preset solution value in step S18. The data received by the server 1 can be used as the solution value.
- In step S19, the update unit 130 outputs the data determined in any of steps S15, S17, and S18 to the combination unit 240.
- Through the above processing, the update unit 130 can obtain a confirmed value or an estimated value from the data received in the estimated quorum and the confirmed quorum, in accordance with the process-saving one-step agreement algorithm.
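The flow of FIG. 7 can be sketched as a single classification function. This is only an illustrative sketch, not the patented implementation: the function name `one_step_agreement`, the result labels, and the idea of passing the two quorum sizes and the server's own value as parameters are assumptions made for the example. Returning `None` stands in for "keep waiting" in steps S11 and S13.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple

# Hypothetical labels for the three kinds of output described in the text.
CONFIRMED, ESTIMATED, SOLUTION = "confirmed", "estimated", "solution"


def one_step_agreement(
    received: Sequence[Hashable],
    est_quorum: int,
    conf_quorum: int,
    own_value: Hashable,
) -> Optional[Tuple[str, Hashable]]:
    """Classify the values received so far (in arrival order).

    Returns None while more data must be awaited (S11 / S13).
    """
    if len(received) < est_quorum:
        return None                          # S11: estimated quorum not yet reached
    est = list(received[:est_quorum])
    if len(set(est)) == 1:                   # S12: all data in the estimated quorum match
        if len(received) < conf_quorum:
            return None                      # S13: confirmed quorum not yet reached
        conf = list(received[:conf_quorum])
        if len(set(conf)) == 1:              # S14: all data in the confirmed quorum match
            return (CONFIRMED, conf[0])      # S15
        return (ESTIMATED, est[0])           # S17: value agreed in the estimated quorum
    value, count = Counter(est).most_common(1)[0]
    if count > est_quorum // 2:              # S16: a majority of the data match
        return (ESTIMATED, value)            # S17
    return (SOLUTION, own_value)             # S18: fall back to the server's own data
```

With five servers, an estimated quorum of 3 and a confirmed quorum of 4, four identical values yield a confirmed value, while three identical values followed by a different one yield only an estimated value.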
- FIG. 8 is a flowchart illustrating an example of the two-step agreement process performed in step S5 of FIG. 6. This process is executed by the two-step agreement unit 220 of the update unit 130.
- First, the update unit 130 waits until the front-stage unit 220-A has received data from 1/2 of all servers 1 (the selected quorum) (S21). Once data has been received from the 1/2 of all servers 1 that constitute the selected quorum, the update unit 130 determines whether all of the received data match (S22). If they all match, the process proceeds to step S23; if not, it proceeds to step S24.
- In step S23, the front-stage unit 220-A transmits the data received from the other servers 1 to the rear-stage unit 220-B of each server 1.
- In step S24, which is reached when the data of the selected quorum do not all match, the front-stage unit 220-A transmits empty data to the rear-stage unit 220-B of each server 1.
- In step S25, the process waits until the rear-stage unit 220-B has received data from 1/2 of all servers 1 (the counting quorum).
- The update unit 130 then determines whether all of the received data match (S26). If they all match, the process proceeds to step S27; if not, it proceeds to step S28.
- In step S28, it is determined whether all the data received by the rear-stage unit 220-B are empty. If not all of the data are empty, the update unit 130 proceeds to step S29 and determines any one of the non-empty data to be the estimated value.
- If all of the data are empty, the process proceeds to step S30, and the data is set to a preset solution value.
- The data received by the server 1 can be used as the solution value.
- In step S31, the update unit 130 outputs the data determined in any of steps S27, S29, and S30 to the combination unit 240.
- Through the above processing, the update unit 130 can obtain a confirmed value or an estimated value from the data received in the selected quorum and the counting quorum, in accordance with the two-step agreement algorithm.
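The two stages of FIG. 8 can be sketched as two small functions, one per stage. This is a hedged illustration only: the names `front_stage`, `rear_stage`, and `EMPTY`, and the use of `None` to represent empty data, are assumptions made for the example. The sketch also assumes that a counting quorum consisting solely of empty data is not treated as a confirmed value, a case the text does not spell out.

```python
from typing import Hashable, Sequence, Tuple

EMPTY = None  # hypothetical stand-in for the "empty data" sent in step S24


def front_stage(received: Sequence[Hashable], quorum: int) -> Hashable:
    """S21-S24: forward the common value, or EMPTY if the selected quorum disagrees."""
    if len(received) < quorum:
        raise ValueError("selected quorum not yet reached (S21)")
    selected = list(received[:quorum])
    return selected[0] if len(set(selected)) == 1 else EMPTY


def rear_stage(
    first_results: Sequence[Hashable], quorum: int, own_value: Hashable
) -> Tuple[str, Hashable]:
    """S25-S30: classify the front-stage results gathered in the counting quorum."""
    if len(first_results) < quorum:
        raise ValueError("counting quorum not yet reached (S25)")
    counted = list(first_results[:quorum])
    non_empty = [v for v in counted if v is not EMPTY]
    if non_empty and len(set(counted)) == 1:   # S26: all data match
        return ("confirmed", counted[0])       # S27
    if non_empty:                              # S28: not all data are empty
        return ("estimated", non_empty[0])     # S29: any non-empty data
    return ("solution", own_value)             # S30: preset solution value
```

Each server would run `front_stage` on the client data it collected, send the result to every rear stage, and then run `rear_stage` on the results it collected in turn.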
- FIG. 9 is a flowchart showing an example of the processing of the combination unit 240 performed in step S6 of FIG. 6.
- First, the combination unit 240 receives the outputs of the process-saving one-step agreement process (S4) and the two-step agreement process (S5) (S41).
- The combination unit 240 then determines whether a confirmed value is present in either the output of the process-saving one-step agreement process or the output of the two-step agreement process (S42). If a confirmed value is present in either output, the process proceeds to step S43; if not, it proceeds to step S45.
- In step S43, the combination unit 240 selects the confirmed value from the output of either the process-saving one-step agreement process or the two-step agreement process.
- In step S44, the confirmed value selected by the combination unit 240 is set as the agreed value.
- In step S45, the combination unit 240 determines whether an estimated value is present in the output of the two-step agreement process. If an estimated value is present, the process proceeds to step S46; if not, it proceeds to step S47.
- In step S46, the combination unit 240 selects the estimated value of the two-step agreement process and proceeds to step S50.
- In step S47, the combination unit 240 determines whether an estimated value is present in the output of the process-saving one-step agreement process (S1-STEP in the drawing). If an estimated value is present, the process proceeds to step S48; if not, it proceeds to step S49.
- In step S48, the combination unit 240 selects the estimated value of the process-saving one-step agreement process and proceeds to step S50.
- In step S49, the combination unit 240 selects a solution value and proceeds to step S50.
- As the solution value, a value set in advance in the combination unit 240 or a value received by the transmission/reception unit 110 may be used.
- In step S50, the value selected by the combination unit 240 in any of steps S46, S48, and S49 is input to the PAXOS agreement unit 230.
- In step S51, the PAXOS agreement unit 230 computes an agreed value with the other servers 1 and outputs it.
- The PAXOS agreement unit 230 is the same as that of prior art example 1 described above, so it is not described in detail here.
- In step S52, the combination unit 240 sets the confirmed value of the PAXOS agreement unit 230 as the agreed value.
- In step S53, the combination unit 240 outputs the agreed value obtained in step S44 or step S52.
- Through the above processing, the combination unit 240 of the update unit 130 can set the confirmed value of either the process-saving one-step agreement unit 210 or the two-step agreement unit 220 as the agreed value.
- When no confirmed value is obtained, the agreed value can be obtained from the PAXOS agreement unit 230 with the estimated value or the solution value as input.
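The selection order of FIG. 9 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `combine`, the `(kind, value)` result pairs, and the `paxos` callable (a placeholder for the PAXOS agreement unit 230, whose internals the text does not describe) are all hypothetical. For step S49 the sketch simply uses the solution value carried by the one-step result.

```python
from typing import Callable, Hashable, Tuple

# (kind, value), where kind is "confirmed", "estimated" or "solution" (hypothetical labels)
Result = Tuple[str, Hashable]


def combine(
    one_step: Result,
    two_step: Result,
    paxos: Callable[[Hashable], Hashable],
) -> Hashable:
    """Return the agreed value from the two agreement outputs (S41-S53)."""
    # S42-S44: a confirmed value from either process becomes the agreed value.
    for kind, value in (one_step, two_step):
        if kind == "confirmed":
            return value
    # S45-S49: otherwise choose the PAXOS input by priority.
    if two_step[0] == "estimated":
        proposal = two_step[1]      # S46: two-step estimated value
    elif one_step[0] == "estimated":
        proposal = one_step[1]      # S48: one-step estimated value
    else:
        proposal = one_step[1]      # S49: solution value (assumption: taken from one-step)
    # S50-S52: the PAXOS output is the agreed value.
    return paxos(proposal)
```

Note that PAXOS is invoked only on the slow path; when either process yields a confirmed value, the function returns without calling `paxos` at all.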
- According to the present invention, the required number of processes n is smaller than that of the one-step agreement algorithm and can be set to a value equivalent to that of PAXOS.
- The allowable failure number e of the present invention remains e < n/2, equivalent to that of PAXOS, and the minimum communication count δ is smaller than that of PAXOS, which is twice that of the one-step agreement algorithm.
- When no confirmed value is obtained, the PAXOS agreement unit 230 computes the agreed value, but if the PAXOS agreement unit 230 is executed frequently, latency increases.
- The update unit 130 executes the process-saving one-step agreement unit 210 and the two-step agreement unit 220.
- When the PAXOS agreement unit 230 has been executed more than a preset number of times within a predetermined time, the agreed value may instead be computed by the PAXOS agreement unit 230 alone.
- In that case, the update unit 130 may stop executing the process-saving one-step agreement unit 210 and the two-step agreement unit 220, or may stop inputting data to them.
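One way to decide when to switch to PAXOS alone is a sliding-window counter. The sketch below is an assumption-laden illustration, not a description of the actual update unit 130: the class name `PaxosFallbackMonitor`, its parameters, and the choice of passing timestamps explicitly (for example from `time.monotonic()`) are all hypothetical.

```python
from collections import deque


class PaxosFallbackMonitor:
    """Tracks how often PAXOS ran recently (hypothetical helper)."""

    def __init__(self, max_runs: int, window_seconds: float) -> None:
        self.max_runs = max_runs          # preset number of executions
        self.window = window_seconds      # predetermined time
        self._runs = deque()              # timestamps of recent PAXOS executions

    def record_paxos_run(self, now: float) -> None:
        """Call each time the PAXOS agreement unit is executed."""
        self._runs.append(now)

    def paxos_only(self, now: float) -> bool:
        """True if PAXOS ran more than max_runs times within the window."""
        # Drop executions that are older than the window.
        while self._runs and now - self._runs[0] > self.window:
            self._runs.popleft()
        return len(self._runs) > self.max_runs
```

While `paxos_only` returns `True`, a caller would skip the one-step and two-step agreement units (or stop feeding them data) and send requests straight to PAXOS.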
- FIG. 10 is a flowchart illustrating an example of processing performed by the combination unit 240 when data from the client 3 is processed in total order according to the second embodiment.
- The second embodiment changes the processing of the combination unit 240 of the first embodiment; the other configurations are the same as in the first embodiment.
- Total order means that the data received by each transmission/reception unit 110 is compared each time.
- When data with a higher priority is available, data with a lower priority is discarded, and data consistency is guaranteed using only the higher-priority data.
- The process of FIG. 10 is executed as the processing of the combination unit in step S6 shown in FIG. 6 of the first embodiment.
- In step S61, the combination unit 240 waits until an output result of either the process-saving one-step agreement process (S1-STEP in the figure) or the two-step agreement process (2-STEP in the figure) is input.
- The combination unit 240 determines whether the output result is a confirmed value output by the high-priority process-saving one-step agreement process (S62). If it is a confirmed value from the high-priority process-saving one-step agreement process, the combination unit 240 determines that agreement has been confirmed and proceeds to step S70. Otherwise, for example when the result comes from the lower-priority two-step agreement process or is an estimated value rather than a confirmed value, the combination unit 240 proceeds to step S63 and waits until the other output result is input (S63).
- When the other output result is received, the combination unit 240 determines whether it is a confirmed value (S64). If it is, a confirmed value has been input from each of the two-step agreement process and the process-saving one-step agreement process (first priority), so the combination unit 240 determines that agreement has been confirmed and proceeds to step S70. In step S70, the combination unit 240 outputs the confirmed value for which agreement was confirmed in step S62 or S64 as the agreed value. If the received output result is not a confirmed value, the combination unit 240 proceeds to step S65.
- In step S65, the combination unit 240 refers to the output result of the two-step agreement process and determines in step S66 whether it is an estimated value. If the output result is the estimated value of the two-step agreement process, which has the second priority, the combination unit 240 proceeds to step S71 and inputs the estimated value to the PAXOS agreement unit 230. If the output result is not the estimated value of the two-step agreement process, the process proceeds to step S67.
- In step S67, the combination unit 240 refers to the output result of the process-saving one-step agreement process and determines in step S68 whether it is an estimated value. If the output result is the estimated value of the process-saving one-step agreement process, which has the third priority, the combination unit 240 proceeds to step S71 and inputs the estimated value to the PAXOS agreement unit 230. If the output result is not the estimated value of the process-saving one-step agreement process, the process proceeds to step S69.
- In step S69, the combination unit 240 refers to the inputs of the process-saving one-step agreement process or of the two-step agreement process (input from the client 3 or from the front-stage unit 220-A of another server 1) and selects one of these inputs.
- In step S71, the combination unit 240 inputs the estimated value referred to in step S66 or S68, or the input selected in step S69, to the PAXOS agreement unit 230, which computes an agreed value with each server 1.
- In step S72, the output of the PAXOS agreement unit 230 is output as the agreed value.
- The above example uses the outputs of both the process-saving one-step agreement process and the two-step agreement process.
- Alternatively, the processing may be executed when the output of either the process-saving one-step agreement process or the two-step agreement process is received.
- As described above, the update unit 130 sets a priority order for the confirmed values (determination results), and when a confirmed value with a higher priority can be obtained, the determination result with the lower priority is discarded. The update unit 130 then guarantees data consistency using only the higher-priority confirmed value.
- FIGS. 11 to 14 show a third embodiment of the present invention, in which data from the client 3 is processed in a partial order.
- FIG. 11 is a block diagram illustrating an example of data received by the transmission / reception unit 110 of each server 1 when processing data from the client 3 in a partial order according to the third embodiment of this invention.
- In the figure, transmission/reception units 1 to n are components of the servers 1-1 to 1-n, and each transmission/reception unit 110 includes a buffer 115 that can store a plurality of data in order of arrival.
- Other configurations are the same as those of the first embodiment.
- The update units 130-1 and 130-2 in FIG. 11 are components of the servers 1-1 and 1-2.
- Each update unit 130 processes the data input to it.
- The final result is the same in the update units 130 of all servers 1.
- The occurrence of collisions is reduced by changing the order of the data transmitted from the transmission/reception unit 110 to the update unit 130 so that the data of the update units 130 of the servers 1 coincide.
- Partial order processing handles not a single set of data multicast by the client 3-1 but, as long as the transmission/reception unit 110 of the server 1 receives data whose order can be exchanged, all the data in the buffer 115.
- The order in which data matched by as many servers as the quorum (5 in the illustrated example) is transmitted to the update unit 130 as a confirmed value is then determined.
- A collision is resolved when data whose order cannot be exchanged is received.
- FIG. 12 is a block diagram showing an example of a computer system that resolves data collisions in a partial order process.
- FIG. 12 shows an example in which partial order processing is applied in the process-saving one-step agreement unit 210, the two-step agreement unit 220 (the front-stage unit 220-A and the rear-stage unit 220-B), and the combination unit 240 to resolve a data collision.
- In the figure, transmission/reception units 110-1 to 110-5 and front-stage units 1 to 5 are components of servers 1 to 5, and an example of agreement in the update units 130-1 and 130-5 of server 1 and server 5 is shown. Note that the transmission/reception units 110-1 to 110-5 each have a buffer 115 that holds a plurality of data in input order, as described above.
- Data “C”, “B”, and “A” are held in the buffers 115 of the transmission/reception units 110-1 and 110-2, and data “C”, “A”, and “B” are held in the buffer 115 of the transmission/reception unit 110-3.
- Data “C”, “b”, and “a” are held in the buffer 115 of the transmission/reception unit 110-4.
- Data “A” is held in the buffer 115 of the transmission/reception unit 110-5.
- The data in the buffers 115 of the transmission/reception units 110-1 and 110-2 are the same, whereas in the buffers 115 of the transmission/reception units 110-3 to 110-5 the data “A”, “C”, “a”, and “b” collide.
- The estimated values of the process-saving one-step agreement unit 210 of the server 1 are “C”, “B”, and “A”, and the estimated values of its two-step agreement unit 220 are “C” and “A”.
- The estimated values of the process-saving one-step agreement unit 210 of the server 5 are “C”, “a”, and “b”, and the estimated values of its two-step agreement unit 220 are “C” and “A”.
- The estimated value of the two-step agreement unit 220 (hereinafter, the two-step (2-STEP) estimated value) and the estimated value of the process-saving one-step agreement unit 210 (hereinafter, the one-step (S1-STEP in the figure) estimated value) are each determined. Then, as described later, the two-step estimated value and the one-step estimated value are output to the combination unit 240 in this order.
- The combination unit 240 combines the input two-step estimated value and one-step estimated value, inputs the combined estimated value to the PAXOS agreement unit 230, which serves as an auxiliary agreement unit, and reaches agreement with the other servers 1.
- The update unit 130 then separates the agreement result output by the PAXOS agreement unit 230 back into a two-step estimated value and a one-step estimated value and outputs them to each update unit 130 in this order.
- If this order were not observed, the one-step estimated value might be transmitted first even though the data A has already been transmitted to the update unit 130-1.
- In that case, the data a, whose order cannot be exchanged with the data A, would be transmitted to the update unit 130-5 first, and data consistency would be lost.
- For this reason, the one-step estimated value is determined after the two-step estimated value, and a PAXOS agreement (auxiliary agreement) is performed using the estimated value obtained by combining them.
- FIG. 13 is a flowchart illustrating an example of a partial order process performed by the combination unit 240.
- FIG. 14 is a flowchart illustrating an example of processing performed in the collision resolution processing S84 of FIG.
- In step S81, the combination unit 240 waits for input of a confirmed value from the process-saving one-step agreement unit 210, a confirmed value from the two-step agreement unit 220, or a collision determination value.
- The collision determination value is an estimated value or a solution value (or an indefinite or arbitrary value) and is input from either the process-saving one-step agreement unit 210 or the two-step agreement unit 220.
- Next, the combination unit 240 determines whether the received data is a collision determination value (S82). If a collision determination value was input, the process proceeds to step S84; if a confirmed value was input, it proceeds to step S83.
- In step S83, the combination unit 240 outputs the accepted confirmed value as the agreed value and returns to step S81 to repeat the process.
- In step S84, the combination unit 240 executes the collision resolution process shown in FIG. 14, resolves the collision, and returns to step S81 to repeat the process.
- In step S91, the combination unit 240 acquires an estimated value from the two-step agreement unit 220, which has a high priority.
- The combination unit 240 sets this estimated value as estimated value set 2. If there is no estimated value from the two-step agreement unit 220, estimated value set 2 is set to the empty set and the process proceeds to the next step.
- In step S92, the combination unit 240 acquires the estimated value from the process-saving one-step agreement unit 210 having a high priority.
- The combination unit 240 sets, as estimated value set 1, the portion of this estimated value that remains after removing the data of estimated value set 2 set in step S91. If there is no estimated value from the process-saving one-step agreement unit 210, estimated value set 1 is set to the empty set and the process proceeds to the next step.
- In step S93, the combination unit 240 acquires a solution value from the process-saving one-step agreement unit 210 having a low priority.
- The combination unit 240 obtains the indefinite set by removing estimated value set 1 set in step S92 and estimated value set 2 set in step S91 from the solution value. If there is no solution value from the process-saving one-step agreement unit 210, the indefinite set is set to the empty set and the process proceeds to the next step.
- In step S94, the combination unit 240 inputs estimated value set 1, estimated value set 2, and the indefinite set to the PAXOS agreement unit 230 in this order and has the PAXOS agreement executed with the other servers 1.
- In step S95, the PAXOS agreement unit 230 outputs the PAXOS agreed value corresponding to estimated value set 1 as collision-free set 1, the PAXOS agreed value corresponding to estimated value set 2 as collision-free set 2, and the PAXOS agreed value corresponding to the indefinite set as the collision set.
- In step S96, the update unit 130 first outputs collision-free set 2, which corresponds to estimated value set 2.
- In step S97, the update unit 130 outputs collision-free set 1, which corresponds to estimated value set 1.
- Finally, the update unit 130 outputs the collision set corresponding to the indefinite set in a predetermined order.
- Through the above processing, the combination unit 240 of the update unit 130 can resolve collisions and guarantee data consistency on the basis of the PAXOS agreement.
- The description of the indefinite set and the collision set is omitted above; the same processing as in FIG. 14 may be performed for them.
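The set construction of steps S91 to S97 can be sketched as ordered set differences. This is an illustrative sketch only: the names `without` and `resolve_collision` are hypothetical, each value is modeled as a list of data items in arrival order, and `paxos` is a placeholder callable standing in for the PAXOS agreement unit 230 (here it would typically be replaced by a real agreement round).

```python
from typing import Callable, Hashable, List, Sequence

Items = List[Hashable]


def without(items: Sequence[Hashable], *exclude: Sequence[Hashable]) -> Items:
    """Order-preserving difference: items not present in any excluded sequence."""
    excluded = set().union(*map(set, exclude))
    return [x for x in items if x not in excluded]


def resolve_collision(
    two_step_est: Sequence[Hashable],
    one_step_est: Sequence[Hashable],
    solution: Sequence[Hashable],
    paxos: Callable[[Items], Items],
) -> Items:
    """Return the data in the order in which it is output after agreement."""
    est_set2 = list(two_step_est)                        # S91 (empty list if no estimate)
    est_set1 = without(one_step_est, est_set2)           # S92
    indefinite = without(solution, est_set1, est_set2)   # S93
    # S94-S95: agree on each set through PAXOS, in the stated input order.
    free1 = paxos(est_set1)
    free2 = paxos(est_set2)
    collided = paxos(indefinite)
    # S96 onward: collision-free set 2 first, then set 1, then the collision set.
    return free2 + free1 + collided
```

With the estimates given for server 1 above (two-step “C”, “A” and one-step “C”, “B”, “A”), estimated value set 1 reduces to “B”, so the two-step data are always emitted before the remaining one-step data.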
- As described above, the update unit 130 sets a priority order for the determination results, and when the confirmed value (determination result) with the highest priority cannot be obtained, it uses the determination results with lower priority to guarantee data consistency. In this case, it is desirable to use all the lower-priority determination results.
- The configurations of the computer and the processing units described in the present invention may be partially or entirely realized by dedicated hardware.
- The various software exemplified in the present embodiments can be stored in various recording media (for example, non-transitory storage media) such as electromagnetic, electronic, and optical media, and can be downloaded to a computer through a communication network such as the Internet.
- The present invention is not limited to the above-described embodiments and includes various modifications.
- For example, the above-described embodiments have been described in detail for ease of understanding of the present invention, and the invention is not necessarily limited to those having all the configurations described.
- A storage medium storing a program for controlling a server, the program causing the server to execute: a first step of determining the consistency of received data with a first determination unit that determines the consistency of multiplexed data; a second step of determining the consistency of the received data with a second determination unit that tolerates a smaller number of server failures than the first determination unit but requires a larger minimum number of communications between the servers to determine the consistency of the data; a third step of receiving a determination result on the consistency of the data from the first determination unit or the second determination unit and, when the determination result includes data whose consistency is guaranteed, outputting the data whose consistency is guaranteed; and a fourth step of storing the data whose consistency is guaranteed, the storage medium being a non-transitory computer-readable storage medium.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Hardware Redundancy (AREA)
Abstract
Description
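The number of processes is expressed as n = 2f + 1.
From the above, at least two communications between computers (master and slave) are required, and the allowable failure number e is less than n/2. The allowable failure number e is the number of processes (or computers) for which the minimum communication count (latency) can be maintained even if a failure occurs. The latency is the minimum communication count δ from when the client requests a data update (or reference) from the master computer until agreement is reached at the slave computers (data consistency is guaranteed).
FIG. 15 is a block diagram showing an example of the process-saving one-step agreement unit 210. The determination of data identity (consistency) by the process-saving one-step agreement algorithm, and the method of resolution when identity is not guaranteed (hereinafter, a collision), are the same as in the one-step agreement algorithm of Non-Patent Document 1 and are as follows.
- Data identity: confirmed values always match one another.
- Collision resolution: when a confirmed value may exist in any of the servers 1, the estimated value always matches that confirmed value. (When a confirmed value may exist in any of the servers 1, an estimated value matching the confirmed value always exists and becomes the solution value (estimated value).) Otherwise, the solution value matches an arbitrary one of the values of the input data (or proposals).
Qe is the smallest integer such that Qe ≥ (3/4)n.
Qf is the smallest integer such that Qf > n/2.
Note that a majority is the smallest integer greater than n/2.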
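The two quorum sizes can be computed with integer arithmetic alone. The sketch below is a small worked illustration; the function names `quorum_qe` and `quorum_qf` are hypothetical, and reading Qe as the size of the confirmed quorum and Qf as the size of the estimated quorum is an assumption based on the 3/4 and 1/2 fractions used in the description of FIG. 7.

```python
def quorum_qe(n: int) -> int:
    """Smallest integer Qe with Qe >= (3/4) * n, computed by ceiling division."""
    return -(-3 * n // 4)


def quorum_qf(n: int) -> int:
    """Smallest integer Qf with Qf > n / 2, i.e. a majority of n."""
    return n // 2 + 1
```

For n = 5 servers this gives Qe = 4 and Qf = 3, and for n = 7 it gives Qe = 6 and Qf = 4.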
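- Within the confirmed quorum, if the values of all data match, that value is output as the confirmed value; otherwise, empty is output as the confirmed value.
- If it cannot be determined that a confirmed quorum exists, nothing is output as the confirmed value.
- Within the estimated quorum, if the values of the data in some majority portion or more match, that value is output as the estimated value; otherwise, empty is output as the estimated value.
- If the estimated value is not empty, the estimated value is output as the solution value; if it is empty, an arbitrary one of the input data is taken as the solution value. Identity can be ensured by carrying out a PAXOS agreement with the other update units 130 on the solution value or the estimated value.
FIG. 16 is a block diagram showing an example of the two-step agreement unit 220. The determination of data identity by the two-step agreement algorithm and the method of resolving collisions for which identity is not guaranteed are the same as in the process-saving one-step agreement algorithm (and the one-step agreement algorithm) described above.
- Within the counting quorum, if all data match, that value is output as the confirmed value.
- Within the counting quorum, if non-empty data exists, that value is taken as the estimated value; otherwise, empty is output as the estimated value.
FIG. 3 is a sequence diagram showing an example of distributed data management performed by the servers 1 of the present invention. The illustrated example shows the client 3-1 transmitting an update request for data A to the servers 1-1 to 1-n by multicast. The multicast transmission from the client 3-1 to each server 1 may instead be performed by a management computer or the like (not shown).
FIG. 6 is a flowchart showing an example of processing performed by each server. This processing is executed when a data update request (or a reference request or registration request) is received from the client 3.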
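A storage medium storing a program for controlling a server, the program causing the server to execute:
a first step of determining the consistency of received data with a first determination unit that determines the consistency of multiplexed data;
a second step of determining the consistency of the received data with a second determination unit that tolerates a smaller number of server failures than the first determination unit but requires a larger minimum number of communications between the servers to determine the consistency of the data;
a third step of receiving a determination result on the consistency of the data from the first determination unit or the second determination unit and, when the determination result includes data whose consistency is guaranteed, outputting the data whose consistency is guaranteed; and
a fourth step of storing the data whose consistency is guaranteed,
the storage medium being a non-transitory computer-readable storage medium.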
Claims (16)
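- A data management system comprising a plurality of servers each having a processor, a memory, and a storage device, in which data is received and stored by the plurality of servers and held in multiplexed form, wherein
each of the servers comprises:
a first determination unit that determines the consistency of the multiplexed data;
a second determination unit that, in determining the consistency of the multiplexed data, tolerates a larger number of server failures than the first determination unit but requires a larger minimum number of communications between the servers to determine the consistency of the data;
a combination unit that receives a determination result on the consistency of the data from the first determination unit or the second determination unit and, when the determination result includes data whose consistency is guaranteed, outputs the data whose consistency is guaranteed; and
a data storage unit that stores the data output by the combination unit.
- The data management system according to claim 1, wherein
the second determination unit includes:
a front-stage determination unit that determines the consistency of the data and outputs a first determination result; and
a rear-stage determination unit that receives the first determination results from the front-stage determination units of other servers, determines the consistency of the data from the plurality of first determination results, and outputs a second determination result, and
the combination unit,
when the determination results of both the first determination unit and the second determination unit cannot guarantee consistency, outputs the data that partially match among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit.
- The data management system according to claim 2, wherein
the first determination unit
determines the consistency of the data on the basis of data received from a majority of all the servers, and
the combination unit,
when there is no partially matching data among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit, outputs the data on which a majority agree among the data received from the servers by the first determination unit.
- The data management system according to claim 3, wherein
the combination unit,
when there is no data on which a majority agree among the data received from the servers by the first determination unit, acquires a predetermined solution value from the first determination unit or the second determination unit and outputs the solution value.
- The data management system according to claim 2,
further comprising a third determination unit that determines the consistency of the data using the PAXOS algorithm, wherein
the combination unit,
when there is no partially matching data among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit, outputs data based on the determination result of the third determination unit.
- The data management system according to claim 5, wherein,
when the third determination unit has been executed more than a preset number of times within a predetermined time, the processing of the first determination unit and the second determination unit is not executed, and the data output by the third determination unit is stored in the data storage unit.
- The data management system according to claim 1, wherein
the combination unit
sets a priority order for the determination results of the first determination unit and the second determination unit and, when a determination result with a higher priority has been obtained, discards the determination result with a lower priority and guarantees data consistency with the higher-priority determination result.
- The data management system according to claim 1, wherein
the combination unit
sets a priority order for the determination results of the first determination unit and the second determination unit and, when the determination result with the highest priority cannot be obtained, guarantees data consistency using a determination result with a lower priority.
- A data management method for a system comprising a plurality of servers each having a processor, a memory, and a storage device, in which data is received and stored by the plurality of servers and held in multiplexed form, the method comprising: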
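a first step in which the server determines the consistency of the received data with a first determination unit that determines the consistency of the multiplexed data;
a second step in which the server determines the consistency of the received data with a second determination unit that tolerates a larger number of server failures than the first determination unit but requires a larger minimum number of communications between the servers to determine the consistency of the data;
a third step in which the server receives a determination result on the consistency of the data from the first determination unit or the second determination unit and, when the determination result includes data whose consistency is guaranteed, outputs the data whose consistency is guaranteed; and
a fourth step in which the server stores the data whose consistency is guaranteed.
- The data management method according to claim 9, wherein
the second determination unit includes:
a front-stage determination unit that determines the consistency of the data and outputs a first determination result; and
a rear-stage determination unit that receives the first determination results from the front-stage determination units of other servers, determines the consistency of the data from the plurality of first determination results, and outputs a second determination result, and
the third step,
when the determination results of both the first determination unit and the second determination unit cannot guarantee consistency, outputs the data that partially match among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit.
- The data management method according to claim 10, wherein
the first determination unit
determines the consistency of the data on the basis of data received from a majority of all the servers, and
the third step,
when there is no partially matching data among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit, outputs the data on which a majority agree among the data received from the servers by the first determination unit.
- The data management method according to claim 11, wherein
the third step,
when there is no data on which a majority agree among the data received from the servers by the first determination unit, acquires a predetermined solution value from the first determination unit or the second determination unit and outputs the solution value.
- The data management method according to claim 10, wherein
the server further has a third determination unit that determines the consistency of the data using the PAXOS algorithm, and
the third step,
when there is no partially matching data among the second determination results received from a majority of the servers by the rear-stage determination unit of the second determination unit, outputs data based on the determination result of the third determination unit.
- The data management method according to claim 13, wherein
the third step,
when the third determination unit has been executed more than a preset number of times within a predetermined time, does not execute the processing of the first determination unit and the second determination unit and stores the data output by the third determination unit.
- The data management method according to claim 9, wherein
the third step
sets a priority order for the determination results of the first determination unit and the second determination unit and, when a determination result with a higher priority has been obtained, discards the determination result with a lower priority and guarantees data consistency with the higher-priority determination result.
- The data management method according to claim 9, wherein
the third step
sets a priority order for the determination results of the first determination unit and the second determination unit and, when the determination result with the highest priority cannot be obtained, guarantees data consistency using a determination result with a lower priority.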
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/064739 WO2015186191A1 (ja) | 2014-06-03 | 2014-06-03 | Data management system and data management method |
US15/125,715 US10545949B2 (en) | 2014-06-03 | 2014-06-03 | Data management system and data management method |
JP2016524971A JP6271003B2 (ja) | 2014-06-03 | 2014-06-03 | Data management system and data management method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/064739 WO2015186191A1 (ja) | 2014-06-03 | 2014-06-03 | Data management system and data management method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015186191A1 (ja) | 2015-12-10 |
Family
ID=54766291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2014/064739 WO2015186191A1 (ja) | Data management system and data management method | 2014-06-03 | 2014-06-03 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10545949B2 (ja) |
JP (1) | JP6271003B2 (ja) |
WO (1) | WO2015186191A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
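JP2020115315A (ja) * | 2019-01-18 | 2020-07-30 | Hitachi, Ltd. | Distributed processing method and distributed processing system |
JP2020187526A (ja) * | 2019-05-14 | 2020-11-19 | Hitachi, Ltd. | Distributed processing method, distributed processing system, and server |
JP2022160937A (ja) * | 2021-04-07 | 2022-10-20 | Hitachi, Ltd. | Distributed consensus method, distributed system, and distributed consensus program |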
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006155614A (ja) * | 2004-11-23 | 2006-06-15 | Microsoft Corp | Generalized Paxos |
JP2010122773A (ja) * | 2008-11-18 | 2010-06-03 | Hitachi Ltd | Distributed processing system, processing allocation method, and information processing apparatus |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5261085A (en) | 1989-06-23 | 1993-11-09 | Digital Equipment Corporation | Fault-tolerant system and method for implementing a distributed state machine |
US20020143724A1 (en) * | 2001-01-16 | 2002-10-03 | International Business Machines Corporation | Method, system and computer program product to partition filter rules for efficient enforcement |
AU2003217599A1 (en) * | 2002-02-22 | 2003-09-09 | Bea Systems, Inc. | System and method for using a data replication service to manage a configuration repository |
US7395279B2 (en) * | 2003-11-17 | 2008-07-01 | International Business Machines Corporation | System and method for achieving different levels of data consistency |
US7353285B2 (en) * | 2003-11-20 | 2008-04-01 | International Business Machines Corporation | Apparatus, system, and method for maintaining task prioritization and load balancing |
US8549180B2 (en) * | 2004-10-22 | 2013-10-01 | Microsoft Corporation | Optimizing access to federation infrastructure-based resources |
US20080071878A1 (en) * | 2006-09-18 | 2008-03-20 | Reuter James M | Method and system for strong-leader election in a distributed computer system |
JP5213108B2 (ja) * | 2008-03-18 | 2013-06-19 | Hitachi, Ltd. | Data replication method and data replication system |
US20110004521A1 (en) * | 2009-07-06 | 2011-01-06 | Yahoo! Inc. | Techniques For Use In Sorting Partially Sorted Lists |
JP2011123817A (ja) * | 2009-12-14 | 2011-06-23 | Fujitsu Ltd | Job distribution device, job distribution program, and job distribution method |
US8726036B2 (en) * | 2011-09-20 | 2014-05-13 | Wallrust, Inc. | Identifying peers by their interpersonal relationships |
US9747310B2 (en) * | 2012-06-04 | 2017-08-29 | Google Inc. | Systems and methods of increasing database access concurrency using granular timestamps |
2014
- 2014-06-03 WO PCT/JP2014/064739 patent/WO2015186191A1/ja active Application Filing
- 2014-06-03 US US15/125,715 patent/US10545949B2/en active Active
- 2014-06-03 JP JP2016524971A patent/JP6271003B2/ja active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006155614A (ja) * | 2004-11-23 | 2006-06-15 | Microsoft Corp | Generalized Paxos |
JP2010122773A (ja) * | 2008-11-18 | 2010-06-03 | Hitachi Ltd | Distributed processing system, processing allocation method, and information processing apparatus |
Non-Patent Citations (1)
Title |
---|
CAMARGOS ET AL., MULTICOORDINATED PAXOS, IN: PODC '07 PROCEEDINGS OF THE TWENTY-SIXTH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2007, pages 316 - 317, XP055240877, ISBN: 978-1-59593-616-5 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020115315A (ja) * | 2019-01-18 | 2020-07-30 | 株式会社日立製作所 | 分散処理方法及び分散処理システム |
US11106552B2 (en) | 2019-01-18 | 2021-08-31 | Hitachi, Ltd. | Distributed processing method and distributed processing system providing continuation of normal processing if byzantine failure occurs |
JP2020187526A (ja) * | 2019-05-14 | 2020-11-19 | Hitachi, Ltd. | Distributed processing method, distributed processing system, and server |
US11354206B2 (en) | 2019-05-14 | 2022-06-07 | Hitachi, Ltd. | Distributed processing method, distributed processing system, and server |
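JP2022160937A (ja) * | 2021-04-07 | 2022-10-20 | Hitachi, Ltd. | Distributed consensus method, distributed system, and distributed consensus program |
JP7225298B2 (ja) | 2021-04-07 | 2023-02-20 | Hitachi, Ltd. | Distributed consensus method, distributed system, and distributed consensus program |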
Also Published As
Publication number | Publication date |
---|---|
JPWO2015186191A1 (ja) | 2017-04-20 |
JP6271003B2 (ja) | 2018-01-31 |
US20170011086A1 (en) | 2017-01-12 |
US10545949B2 (en) | 2020-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10148751B1 (en) | Asymmetric active-active storage for hyper-converged system | |
EP3274853B1 (en) | Direct memory access descriptor processing | |
US20170054802A1 (en) | Read-after-write consistency in data replication | |
US10489378B2 (en) | Detection and resolution of conflicts in data synchronization | |
US10430217B2 (en) | High availability using dynamic quorum-based arbitration | |
US10503415B2 (en) | Snapshot processing method and related device | |
CN105677673B (zh) | Service processing method, device, and system | |
US10108605B1 (en) | Natural language processing system and method | |
JP6271003B2 (ja) | Data management system and data management method | |
CN112884086A (zh) | Model training method, apparatus, device, storage medium, and program product | |
US20170177696A1 (en) | Usage of modeled validations on mobile devices in online and offline scenarios | |
US8972365B2 (en) | Storage system and storage device | |
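CN112527901A (zh) | Data storage system and method, computing device, and computer storage medium | |
WO2016101759A1 (zh) | Data routing method, data management apparatus, and distributed storage system | |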
EP2620876B1 (en) | Method and apparatus for data processing, pci-e bus system and server | |
US20160057068A1 (en) | System and method for transmitting data embedded into control information | |
US20150213102A1 (en) | Synchronous data replication in a content management system | |
US9208114B2 (en) | Storage device, computer-readable recording medium, and storage control method | |
CN108011926B (zh) | Message sending method, processing method, server, and system | |
US10831561B2 (en) | Method for changing allocation of data using synchronization token | |
RU2698766C1 (ru) | Method and device for transmitting, sending, and receiving information | |
US10353735B2 (en) | Computing system including independent coupling facilities maintaining equivalency based on sequence values | |
US20170300356A1 (en) | Fine-grain synchronization in data-parallel jobs | |
US11327785B2 (en) | Computing system including enhanced application performance based on last completed operation sequence value | |
US20150052263A1 (en) | Information processing system and control method of information processing system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14893815; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2016524971; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 15125715; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 14893815; Country of ref document: EP; Kind code of ref document: A1 |