CN111104070A - Method and system for realizing data consistency in distributed system - Google Patents
- Publication number
- CN111104070A
- Authority
- CN
- China
- Prior art keywords
- entry
- data consistency
- distributed
- implementing data
- executed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0656—Data buffering arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method and a system for realizing data consistency in a distributed system. The method comprises the following steps: assigning a designated entry ID for each IO operation; and executing IO operations on all the copy nodes in parallel according to the entry ID. The method and the system for realizing data consistency in the distributed system can ensure the data consistency among different copy nodes on the premise of the concurrency of IO operation.
Description
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to a method and a system for implementing data consistency in a distributed system.
Background
Because distributed storage keeps multiple copies of the data, strong consistency between the copies must be ensured. When multiple clients write concurrently, conflicts arise easily.
Referring to FIG. 1, assume that there are two clients that want to modify key x at the same time, but with different results. Client1 wants to modify x from 3 to 4, and Client2 wants to modify x from 3 to 5.
Assume that Client1 successfully modifies the first copy from 3 to 4, while Client2 successfully modifies the third copy from 3 to 5. Client1 will then fail to modify the third copy, because the value of the third copy has already changed to 5. Likewise, Client2 will fail to modify the first copy.
This is the "conflict" mentioned earlier: the system cannot tell whether the final value of x should be 4, 5, or some other value. More seriously, the system cannot recover from this "conflict" state, so not even eventual consistency is achieved.
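The conflict can be reproduced with a small sketch. It assumes that each copy applies a write only if it still holds the value the client expects (a compare-and-set); the function name `compare_and_set` and the dictionary-based copies are illustrative and do not come from the patent.

```python
def compare_and_set(replica, key, expected, new):
    """Apply the write only if the replica still holds the expected value."""
    if replica.get(key) == expected:
        replica[key] = new
        return True
    return False

# Three copies of key x, all starting at 3.
replicas = [{"x": 3} for _ in range(3)]

# Client1 (3 -> 4) reaches the first copy first; Client2 (3 -> 5) reaches the third copy first.
c1_on_first = compare_and_set(replicas[0], "x", 3, 4)
c2_on_third = compare_and_set(replicas[2], "x", 3, 5)

# Each client then tries the copy the other one already changed, and fails.
c1_on_third = compare_and_set(replicas[2], "x", 3, 4)
c2_on_first = compare_and_set(replicas[0], "x", 3, 5)

values = [r["x"] for r in replicas]
print(values)  # the three copies now disagree
```

After these four steps the copies hold three different values, and neither client can finish its update without some external ordering of the two writes.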
Existing solutions typically use a log to serialize all requests. The log orders the request stream into a single sequence; the requests are then read from the log in order and applied to the local state. Refer to FIG. 2 for the specific implementation procedure.
Ordering write requests with a log guarantees the consistency of write IO, but the performance is not optimal: the next write IO can be issued only after all copies have been written and the confirmation has been returned to the client. Copies that finish writing first must wait for the slowest copy, so the performance of the disks is not fully utilized.
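A minimal sketch of the log-based approach, under the assumption that the log is a simple in-memory list and that every copy replays it from the start; `SerializedLog` and `replay` are hypothetical names used only for illustration.

```python
class SerializedLog:
    """Orders all write requests into a single sequence."""

    def __init__(self):
        self.entries = []

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries) - 1  # position of the request in the log


def replay(log, state):
    """Read the requests from the log in order and apply them to a local state."""
    for key, value in log.entries:
        state[key] = value
    return state


log = SerializedLog()
log.append("x", 4)  # request from Client1
log.append("x", 5)  # request from Client2

# Every copy replays the same sequence, so every copy ends in the same state.
replicas = [replay(log, {"x": 3}) for _ in range(3)]
print(replicas)
```

Because all copies see the requests in the same order, they converge; the cost described above comes from issuing the next request only after the previous one has been confirmed by every copy.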
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and a system for implementing data consistency in a distributed system, which can ensure data consistency between different replica nodes on the premise of IO operation concurrency.
In order to solve the above technical problem, the present invention provides a method for implementing data consistency in a distributed storage system, which is executed by a client node of the distributed storage system, and the method includes: assigning a designated entry ID for each IO operation; and executing IO operations on all the copy nodes in parallel according to the entry ID.
In some embodiments, the IO operations include write operations.
In some embodiments, assigning a designated entry ID for each IO operation includes: assigning a corresponding entry ID to each IO operation in turn, according to the order in which the IO operations are generated.
In some embodiments, the entry IDs of the sequentially executed IO operations increase in chronological order, from earliest to latest.
In some embodiments, the method further includes: acquiring the ACK messages that the replica nodes return in sequence after the IO operations on the replica nodes have been executed according to the entry IDs.
In some embodiments, the replica node returns an ACK message to the client node according to the ascending order of the entry IDs corresponding to the IO operations.
In addition, the present invention also provides a system for implementing data consistency in a distributed system, which is integrated in a client node of a distributed storage system, and the system comprises: one or more processors; a storage device for storing one or more programs, which when executed by the one or more processors, cause the one or more processors to implement the method for implementing data consistency in a distributed system according to the foregoing description.
With this design, the invention has at least the following advantages:
According to the method and the system for realizing data consistency in the distributed system, a uniform entry ID is provided for each IO operation, and data IO on the copy nodes is carried out according to the uniformly arranged entry IDs, so that write IO consistency is ensured and write performance is improved.
Drawings
The foregoing is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood, the present invention is further described in detail below with reference to the accompanying drawings and the detailed description.
FIG. 1 is a schematic diagram of the data consistency problem in a distributed system provided by the prior art;
FIG. 2 is a schematic diagram of data consistency guaranteed using a log as provided by the prior art;
FIG. 3 is a flowchart of a method for implementing data consistency in a distributed system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an implementation principle of a method for data consistency in a distributed system according to an embodiment of the present invention;
FIG. 5 is a structural diagram of a system for implementing data consistency in a distributed system according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 3 shows an implementation process of the implementation method for data consistency in the distributed system provided by the invention. Referring to fig. 3, the method for implementing data consistency in a distributed system includes:
S31, assigning each IO operation a designated entry ID.
S32, executing IO operations on all copy nodes in parallel according to the entry ID.
In the implementation shown in FIG. 3, data consistency between different replica nodes is guaranteed by the entry ID designated for each IO operation. That is, each time an IO operation is to be executed, the client node assigns a definite entry ID to the IO operation and then hands the IO operation over to all the replica nodes involved, which complete it.
Each replica node executes the IO operations according to the entry IDs it has acquired, and does so independently of the other replica nodes. That is, the replica nodes execute the IO operations in parallel according to the acquired entry IDs.
Because each IO operation is assigned a fixed entry ID, and each replica node executes IO according to the entry ID, data consistency among the replica nodes can be ensured.
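Steps S31 and S32 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a single client assigns the IDs, each replica is an in-process object, and a replica buffers entries that arrive early so that it applies them strictly in entry-ID order. The class names `Replica` and `Client` and the thread pool are hypothetical.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class Replica:
    """Applies writes strictly in entry-ID order, buffering early arrivals."""

    def __init__(self):
        self.state = {}
        self.next_id = 0    # next entry ID this replica may apply
        self.pending = {}   # entries received but not yet applied
        self.lock = threading.Lock()

    def receive(self, entry_id, key, value):
        with self.lock:
            self.pending[entry_id] = (key, value)
            while self.next_id in self.pending:
                k, v = self.pending.pop(self.next_id)
                self.state[k] = v
                self.next_id += 1


class Client:
    """S31: assign an entry ID; S32: dispatch to all replicas in parallel."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.ids = itertools.count()  # entry IDs 0, 1, 2, ...
        self.pool = ThreadPoolExecutor(max_workers=8)

    def write(self, key, value):
        entry_id = next(self.ids)
        return [self.pool.submit(r.receive, entry_id, key, value)
                for r in self.replicas]


replicas = [Replica() for _ in range(3)]
client = Client(replicas)
futures = []
for v in range(10):
    futures.extend(client.write("x", v))  # no waiting between writes
for f in futures:
    f.result()
client.pool.shutdown()

final = [r.state["x"] for r in replicas]
print(final)
```

Whatever order the worker threads deliver the entries in, each replica applies them in entry-ID order, so all copies end with the value of the last entry.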
In addition, because the replica nodes execute the IO operations in parallel, execution efficiency is greatly improved compared with the log serialization approach.
The main reason for this improvement over a log serialization scheme is that a subsequent IO operation can be executed without waiting for the ACK message of the previous IO operation. In the technical scheme provided by the invention, the ACK is not the main technical means of ensuring data consistency; the entry ID is. In addition, parallel IO between different replica nodes can greatly improve the execution efficiency.
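The efficiency argument can be illustrated with a toy cost model. The model and the latency figures are assumptions made for this example only: it ignores network transfer and ACK handling, and it supposes that each replica works through its own queue of writes back to back.

```python
def serialized_time(latencies):
    """latencies[i][r] is the time replica r needs to persist write i.

    With log serialization, each write waits for its slowest replica
    before the next write is issued.
    """
    return sum(max(per_write) for per_write in latencies)


def pipelined_time(latencies):
    """With pipelined sending, each replica drains its own queue independently."""
    n_replicas = len(latencies[0])
    return max(sum(per_write[r] for per_write in latencies)
               for r in range(n_replicas))


# Three writes on three replicas; a different replica is slow each time.
latencies = [[1, 1, 5], [5, 1, 1], [1, 5, 1]]
t_serial = serialized_time(latencies)
t_pipe = pipelined_time(latencies)
print(t_serial, t_pipe)  # 15 7
```

In this example the serialized scheme pays for the slowest replica on every write, while the pipelined scheme is bounded by the busiest single replica.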
Fig. 4 shows the principle of the implementation shown in fig. 3. Referring to fig. 4, in this technical scheme the client node slices the data and keeps an internal logical number for the slices. The slices are written concurrently when they are issued, and consistency is ensured by the logical number.
Each write IO is assigned a sequence number that increases sequentially from 0, referred to as the Entry ID.
Each Entry is sent to all copies in parallel, and all Entries are sent in a pipelined fashion. That is, the write request for the (N+1)th record can be sent without waiting for the write request for the Nth record to return.
The write records may be sent out of order, but acknowledgements (ACKs) are returned in order of Entry ID, thus achieving strict ordering of the log.
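In-order acknowledgement of out-of-order completions can be sketched with a small buffer. The class `InOrderAcker` is a hypothetical helper, and the min-heap is one possible data structure; the patent does not specify how the ordering is enforced.

```python
import heapq


class InOrderAcker:
    """Releases ACKs in ascending Entry ID order, whatever the completion order."""

    def __init__(self):
        self.next_ack = 0  # lowest Entry ID not yet acknowledged
        self.done = []     # min-heap of completed but unacknowledged Entry IDs

    def complete(self, entry_id):
        """Record a finished write and return the Entry IDs that can be ACKed now."""
        heapq.heappush(self.done, entry_id)
        released = []
        while self.done and self.done[0] == self.next_ack:
            released.append(heapq.heappop(self.done))
            self.next_ack += 1
        return released


acker = InOrderAcker()
# Writes finish in the order 2, 0, 1, 4, 3.
acks = [acker.complete(i) for i in (2, 0, 1, 4, 3)]
print(acks)  # [[], [0], [1, 2], [], [3, 4]]
```

An ACK for Entry N is held back until every Entry with a smaller ID has completed, so the client always observes acknowledgements as an unbroken ascending sequence.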
Fig. 5 is a block diagram of the system for implementing data consistency in a distributed system of the present invention. The system corresponds to a client node in a distributed storage system. Referring to fig. 5, the system includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The RAM 503 also stores various programs and data necessary for system operation. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 510 as necessary, so that a computer program read from it is installed into the storage section 508 as necessary.
The technical scheme of the invention has the following beneficial effects:
Write IO consistency is guaranteed, and IO performance is improved.
The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the present invention in any way. Those skilled in the art may make various modifications and equivalent variations to the above description without departing from the spirit and scope of the present invention.
Claims (7)
1. A method for implementing data consistency in a distributed storage system is executed by a client node of the distributed storage system, and is characterized by comprising the following steps:
assigning a designated entry ID for each IO operation;
and executing IO operations on all the copy nodes in parallel according to the entry ID.
2. The method for implementing data consistency in a distributed system according to claim 1, wherein the IO operation includes a write operation.
3. The method of claim 1, wherein assigning a specific entry ID to each IO operation comprises:
assigning a corresponding entry ID to each IO operation in turn, according to the order in which the IO operations are generated.
4. The method of claim 3, wherein the sequentially executed entry IDs are sequentially increased from early to late in time order.
5. The method for implementing data consistency in a distributed system according to claim 3, further comprising:
acquiring the ACK messages sequentially returned by the replica nodes after the IO operations on the replica nodes are executed according to the entry IDs.
6. The method according to claim 5, wherein the replica node returns an ACK message to the client node according to an increasing order of the entry IDs corresponding to the IO operations.
7. A system for implementing data consistency in a distributed storage system, integrated in a client node of the distributed storage system, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for implementing data consistency in the distributed system according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911346226.6A CN111104070A (en) | 2019-12-24 | 2019-12-24 | Method and system for realizing data consistency in distributed system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111104070A (en) | 2020-05-05 |
Family
ID=70423598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911346226.6A Pending CN111104070A (en) | 2019-12-24 | 2019-12-24 | Method and system for realizing data consistency in distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111104070A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050216675A1 (en) * | 2004-03-25 | 2005-09-29 | International Business Machines Corporation | Method and apparatus for directory-based coherence with distributed directory management |
CN103294675A (en) * | 2012-02-23 | 2013-09-11 | 上海盛霄云计算技术有限公司 | Method and device for updating data in distributed storage system |
CN103297268A (en) * | 2013-05-13 | 2013-09-11 | 北京邮电大学 | P2P (peer to peer) technology based distributed data consistency maintaining system and method |
CN103986694A (en) * | 2014-04-23 | 2014-08-13 | 清华大学 | Control method of multi-replication consistency in distributed computer data storing system |
CN104484130A (en) * | 2014-12-04 | 2015-04-01 | 北京同有飞骥科技股份有限公司 | Construction method of horizontal expansion storage system |
CN109327539A (en) * | 2018-11-15 | 2019-02-12 | 上海天玑数据技术有限公司 | A kind of distributed block storage system and its data routing method |
Non-Patent Citations (1)
Title |
---|
王泰格等: "分布式存储系统介绍及其数据一致性实现方法探究", 《企业技术开发》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113239013A (en) * | 2021-05-17 | 2021-08-10 | 北京青云科技股份有限公司 | Distributed system and storage medium |
CN113239013B (en) * | 2021-05-17 | 2024-04-09 | 北京青云科技股份有限公司 | Distributed system and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11042501B2 (en) | Group-based data replication in multi-tenant storage systems | |
US11036540B2 (en) | Transaction commit operations with thread decoupling and grouping of I/O requests | |
JP5191062B2 (en) | Storage control system, operation method related to storage control system, data carrier, and computer program | |
US9575927B2 (en) | RDMA-optimized high-performance distributed cache | |
US7624112B2 (en) | Asynchronously storing transaction information from memory to a persistent storage | |
US6848021B2 (en) | Efficient data backup using a single side file | |
CN110806933B (en) | Batch task processing method, device, equipment and storage medium | |
US20160092488A1 (en) | Concurrency control in a shared storage architecture supporting on-page implicit locks | |
AU2002308664B2 (en) | Reducing latency and message traffic during data and lock transfer in a multi-node system | |
WO2019037617A1 (en) | Data transaction processing method, device, and electronic device | |
WO2018018611A1 (en) | Task processing method and network card | |
CN109241015B (en) | Method for writing data in a distributed storage system | |
WO2020025049A1 (en) | Data synchronization method and apparatus, database host, and storage medium | |
WO2017143824A1 (en) | Transaction execution method, apparatus, and system | |
CN110851276A (en) | Service request processing method, device, server and storage medium | |
JP2023541298A (en) | Transaction processing methods, systems, devices, equipment, and programs | |
US11099960B2 (en) | Dynamically adjusting statistics collection time in a database management system | |
US8359601B2 (en) | Data processing method, cluster system, and data processing program | |
CN111104070A (en) | Method and system for realizing data consistency in distributed system | |
CN111782419B (en) | Cache updating method, device, equipment and storage medium | |
US11281654B2 (en) | Customized roll back strategy for databases in mixed workload environments | |
WO2023246236A1 (en) | Node configuration method, transaction log synchronization method and node for distributed database | |
WO2023116827A1 (en) | High-concurrency data storage method and system | |
US10447607B2 (en) | System and method for dequeue optimization using conditional iteration | |
CN111240810A (en) | Transaction management method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200505 |