CN111104070A - Method and system for realizing data consistency in distributed system


Publication number
CN111104070A
Authority
CN
China
Prior art keywords
entry
data consistency
distributed
implementing data
executed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911346226.6A
Other languages
Chinese (zh)
Inventor
尹微
胡晓鹏
周泽湘
罗华
仇悦
文中领
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Toyou Feiji Electronics Co ltd
Original Assignee
Beijing Toyou Feiji Electronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Toyou Feiji Electronics Co ltd
Priority to CN201911346226.6A
Publication of CN111104070A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for realizing data consistency in a distributed system. The method comprises the following steps: assigning a designated entry ID for each IO operation; and executing IO operations on all the copy nodes in parallel according to the entry ID. The method and the system for realizing data consistency in the distributed system can ensure the data consistency among different copy nodes on the premise of the concurrency of IO operation.

Description

Method and system for realizing data consistency in distributed system
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to a method and a system for implementing data consistency in a distributed system.
Background
Because distributed storage keeps multiple copies of the data, strong consistency between the copies must be ensured. When multiple clients write concurrently, conflicts arise easily.
Referring to FIG. 1, assume that two clients want to modify key x at the same time, but to different values. Client1 wants to change x from 3 to 4, and Client2 wants to change x from 3 to 5.
Assume that Client1 successfully modifies the first copy from 3 to 4, while Client2 successfully modifies the third copy from 3 to 5. Client1 then fails to modify the third copy, because the value of the third copy has already changed to 5. Likewise, Client2 fails to modify the first copy.
This is the "conflict" mentioned above: the system cannot tell whether the final value of x should be 4, 5, or some other value. More seriously, the system cannot recover from this conflict state, so not even eventual consistency is achieved.
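The conflict can be reproduced in a few lines. The following is a minimal sketch, assuming each write is a conditional update that succeeds only if the copy still holds the value the client expects; the helper name `compare_and_set` and the dictionary-based copies are illustrative and do not come from the patent.

```python
def compare_and_set(replica, key, expected, new):
    """Apply the write only if the copy still holds the expected value."""
    if replica.get(key) == expected:
        replica[key] = new
        return True
    return False

# Three copies of key x, all starting at 3.
replicas = [{"x": 3}, {"x": 3}, {"x": 3}]

# Client1 (3 -> 4) reaches the first copy first;
# Client2 (3 -> 5) reaches the third copy first.
c1_first = compare_and_set(replicas[0], "x", 3, 4)
c2_first = compare_and_set(replicas[2], "x", 3, 5)

# Each client then tries the copy the other one already changed, and fails.
c1_on_third = compare_and_set(replicas[2], "x", 3, 4)
c2_on_first = compare_and_set(replicas[0], "x", 3, 5)

values = [r["x"] for r in replicas]
print(values)  # the three copies now disagree
```

Neither client can finish its write on every copy, and the copies hold different values with no rule for choosing a winner.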
Existing solutions typically use a log to serialize all requests. The log orders the request stream into a single sequence; the requests are then read from the log in order and applied to the local state. See FIG. 2 for the specific implementation procedure.
Ordering write requests through the log guarantees write IO consistency, but performance is not optimal: the next write IO can be issued only after all copies have been written and an acknowledgement has been returned to the client. Copies that finish writing early must wait for the slowest copy, so disk performance is not fully utilized.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and a system for implementing data consistency in a distributed system, which can ensure data consistency between different replica nodes on the premise of IO operation concurrency.
In order to solve the above technical problem, the present invention provides a method for implementing data consistency in a distributed storage system, which is executed by a client node of the distributed storage system, and the method includes: assigning a designated entry ID for each IO operation; and executing IO operations on all the copy nodes in parallel according to the entry ID.
In some embodiments, the IO operations include write operations.
In some embodiments, assigning a designated entry ID to each IO operation includes: assigning entry IDs to the IO operations in the order in which the IO operations are generated.
In some embodiments, the entry IDs increase in chronological order, from earliest to latest.
In some embodiments, the method further comprises: acquiring the ACK messages that the replica nodes return in sequence after the IO operations on the replica nodes have been executed according to the entry IDs.
In some embodiments, the replica node returns an ACK message to the client node according to the ascending order of the entry IDs corresponding to the IO operations.
In addition, the present invention also provides a system for implementing data consistency in a distributed system, which is integrated in a client node of a distributed storage system, and the system comprises: one or more processors; a storage device for storing one or more programs, which when executed by the one or more processors, cause the one or more processors to implement the method for implementing data consistency in a distributed system according to the foregoing description.
With this design, the invention has at least the following advantages:
The method and system for realizing data consistency in a distributed system assign a uniform entry ID to each IO operation, and data IO on the replica nodes is carried out according to these uniformly assigned entry IDs, so write IO consistency is ensured and write performance is improved.
Drawings
The foregoing is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood, the present invention is further described in detail below with reference to the accompanying drawings and the detailed description.
FIG. 1 is a schematic diagram of the data consistency problem in a distributed system provided by the prior art;
FIG. 2 is a schematic diagram of data consistency guaranteed using a log as provided by the prior art;
FIG. 3 is a flowchart of a method for implementing data consistency in a distributed system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an implementation principle of a method for data consistency in a distributed system according to an embodiment of the present invention;
FIG. 5 is a structural diagram of a system for implementing data consistency in a distributed system according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
FIG. 3 shows the flow of the method for implementing data consistency in a distributed system provided by the invention. Referring to FIG. 3, the method includes:
S31, assigning each IO operation a designated entry ID.
S32, executing IO operations on all replica nodes in parallel according to the entry ID.
In the implementation shown in FIG. 3, data consistency between different replica nodes is guaranteed by the entry ID designated for each IO operation. That is, each time an IO operation is to be executed, the client node assigns a fixed entry ID to it and then hands the IO operation over to all the replica nodes involved, which complete it.
Each replica node executes each IO operation according to the entry ID it receives, and does so independently of the other replica nodes. That is, the replica nodes execute IO operations in parallel according to the received entry IDs.
Because each IO operation is assigned a fixed entry ID, and each replica node executes IO according to the entry ID, data consistency among the replica nodes can be ensured.
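The idea can be sketched as follows. This is only an illustration under stated assumptions: it reads "executes IO according to the entry ID" as "applies writes in ascending entry-ID order", and the `Replica` class, the `deliver` helper, and the shuffled delivery order are hypothetical, not part of the patent.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import count


class Replica:
    """Hypothetical replica that applies writes strictly in entry-ID order."""

    def __init__(self):
        self.state = {}
        self.pending = {}   # entry_id -> (key, value), buffered until its turn
        self.next_id = 0    # next entry ID this replica is allowed to apply

    def receive(self, entry_id, key, value):
        self.pending[entry_id] = (key, value)
        # Apply every buffered entry that is now contiguous with next_id.
        while self.next_id in self.pending:
            k, v = self.pending.pop(self.next_id)
            self.state[k] = v
            self.next_id += 1


def deliver(replica, entries, seed):
    """Send all entries to one replica in a replica-specific arrival order."""
    shuffled = entries[:]
    random.Random(seed).shuffle(shuffled)
    for entry_id, key, value in shuffled:
        replica.receive(entry_id, key, value)
    return replica.state


# Client side: assign entry IDs 0, 1, 2, ... in the order writes are generated.
ids = count(0)
writes = [("x", 4), ("x", 5), ("y", 1), ("x", 7)]
entries = [(next(ids), k, v) for k, v in writes]

# Each replica is driven by its own thread and sees a different arrival order.
replicas = [Replica() for _ in range(3)]
with ThreadPoolExecutor(max_workers=3) as pool:
    states = list(pool.map(deliver, replicas, [entries] * 3, [1, 2, 3]))

print(states)
```

Although the three replicas receive the entries in different orders, every replica ends in the same state, because the entry ID alone decides the order in which writes take effect.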
In addition, because the replica nodes execute IO operations in parallel, execution efficiency is greatly improved compared with the log serialization approach.
Compared with a log serialization scheme, the method improves efficiency mainly because a subsequent IO operation can be executed without waiting for the ACK message of the previous IO operation. In the technical scheme provided by the invention, the ACK is not the main means of ensuring data consistency; the entry ID is. In addition, parallel IO across different replica nodes greatly improves execution efficiency.
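A rough timing model illustrates where the gain comes from. It is a simplified sketch, not a measurement: the latency figures are invented, network delay is ignored, and each replica is assumed to process its own writes one after another.

```python
# latency[i][r]: hypothetical time for replica r to persist write i
latency = [
    [1, 2, 5],
    [4, 1, 1],
    [1, 6, 1],
]


def serialized_total(latency):
    """Log serialization: write i+1 starts only after every replica
    has finished write i, so each round costs the slowest replica's time."""
    return sum(max(row) for row in latency)


def pipelined_total(latency):
    """Pipelined entries: each replica works through its own queue
    without waiting for the other replicas between writes."""
    per_replica = [sum(col) for col in zip(*latency)]
    return max(per_replica)


print(serialized_total(latency))  # 5 + 4 + 6 = 15
print(pipelined_total(latency))   # max(6, 9, 7) = 9
```

In this model the pipelined total can never exceed the serialized total, since the maximum of the per-replica sums is at most the sum of the per-write maxima.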
FIG. 4 shows the principle of the implementation shown in FIG. 3. Referring to FIG. 4, in this technical scheme the client node slices the data and gives each slice an internal logical number; the slices are written concurrently when issued, and consistency is ensured by the logical numbers.
Each write IO is assigned a sequence number that increases sequentially from 0, referred to as the Entry ID.
Each Entry is sent to all copies in parallel, and all Entries are sent in a pipelined fashion. That is, sending the write request for the (N+1)th record does not need to wait for the write request for the Nth record to return.
Write records may be sent out of order, but acknowledgements (ACKs) are returned in Entry ID order, which achieves strict ordering of the log.
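In-order acknowledgement over out-of-order completion can be sketched with a small reordering buffer. The `AckSequencer` class below is a hypothetical illustration; the patent does not specify how the ordering of ACKs is implemented.

```python
class AckSequencer:
    """Hypothetical helper: writes may finish in any order,
    but ACKs are released strictly in ascending Entry ID order."""

    def __init__(self):
        self.done = set()    # finished Entry IDs not yet acknowledged
        self.next_ack = 0    # lowest Entry ID that has not been acknowledged

    def complete(self, entry_id):
        """Record a finished write; return the Entry IDs that can now be ACKed."""
        self.done.add(entry_id)
        released = []
        while self.next_ack in self.done:
            self.done.remove(self.next_ack)
            released.append(self.next_ack)
            self.next_ack += 1
        return released


seq = AckSequencer()
acks = []
# Writes complete out of order: entry 2 finishes before entries 0 and 1.
for finished in [2, 0, 1, 4, 3]:
    acks.extend(seq.complete(finished))

print(acks)  # [0, 1, 2, 3, 4]
```

Entry 2 is held back until entries 0 and 1 have completed, so the client never sees an ACK for a later Entry before the ACKs for all earlier ones.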
FIG. 5 is a structural diagram of the system for implementing data consistency in a distributed system of the present invention. The system corresponds to a client node in a distributed storage system. Referring to FIG. 5, the system includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The RAM 503 also stores the various programs and data necessary for system operation. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 510 as necessary, so that a computer program read from it is installed into the storage section 508 as necessary.
The technical scheme of the invention has the following beneficial effects:
Write IO consistency is guaranteed, and IO performance is improved.
The above description merely illustrates the preferred embodiments of the present invention and does not limit the invention in any way. Those skilled in the art may make various modifications, equivalent variations, or refinements to the above description without departing from the spirit and scope of the present invention.

Claims (7)

1. A method for implementing data consistency in a distributed storage system, executed by a client node of the distributed storage system, characterized by comprising the following steps:
assigning a designated entry ID for each IO operation;
and executing IO operations on all the copy nodes in parallel according to the entry ID.
2. The method for implementing data consistency in a distributed system according to claim 1, wherein the IO operation includes a write operation.
3. The method of claim 1, wherein assigning a designated entry ID to each IO operation comprises:
assigning entry IDs to the IO operations in the order in which the IO operations are generated.
4. The method of claim 3, wherein the entry IDs increase in chronological order, from earliest to latest.
5. The method for implementing data consistency in a distributed system according to claim 3, further comprising:
acquiring the ACK messages that the replica nodes return in sequence after the IO operations on the replica nodes have been executed according to the entry IDs.
6. The method according to claim 5, wherein the replica node returns an ACK message to the client node according to an increasing order of the entry IDs corresponding to the IO operations.
7. A system for implementing data consistency in a distributed storage system, integrated in a client node of the distributed storage system, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for implementing data consistency in the distributed system according to any one of claims 1 to 6.
CN201911346226.6A 2019-12-24 2019-12-24 Method and system for realizing data consistency in distributed system Pending CN111104070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911346226.6A CN111104070A (en) 2019-12-24 2019-12-24 Method and system for realizing data consistency in distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911346226.6A CN111104070A (en) 2019-12-24 2019-12-24 Method and system for realizing data consistency in distributed system

Publications (1)

Publication Number Publication Date
CN111104070A (en) 2020-05-05

Family

ID=70423598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911346226.6A Pending CN111104070A (en) 2019-12-24 2019-12-24 Method and system for realizing data consistency in distributed system

Country Status (1)

Country Link
CN (1) CN111104070A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050216675A1 (en) * 2004-03-25 2005-09-29 International Business Machines Corporation Method and apparatus for directory-based coherence with distributed directory management
CN103294675A (en) * 2012-02-23 2013-09-11 上海盛霄云计算技术有限公司 Method and device for updating data in distributed storage system
CN103297268A (en) * 2013-05-13 2013-09-11 北京邮电大学 P2P (peer to peer) technology based distributed data consistency maintaining system and method
CN103986694A (en) * 2014-04-23 2014-08-13 清华大学 Control method of multi-replication consistency in distributed computer data storing system
CN104484130A (en) * 2014-12-04 2015-04-01 北京同有飞骥科技股份有限公司 Construction method of horizontal expansion storage system
CN109327539A (en) * 2018-11-15 2019-02-12 上海天玑数据技术有限公司 A kind of distributed block storage system and its data routing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王泰格等: "分布式存储系统介绍及其数据一致性实现方法探究", 《企业技术开发》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239013A (en) * 2021-05-17 2021-08-10 北京青云科技股份有限公司 Distributed system and storage medium
CN113239013B (en) * 2021-05-17 2024-04-09 北京青云科技股份有限公司 Distributed system and storage medium

Similar Documents

Publication Publication Date Title
US11042501B2 (en) Group-based data replication in multi-tenant storage systems
US11036540B2 (en) Transaction commit operations with thread decoupling and grouping of I/O requests
JP5191062B2 (en) Storage control system, operation method related to storage control system, data carrier, and computer program
US9575927B2 (en) RDMA-optimized high-performance distributed cache
US7624112B2 (en) Asynchronously storing transaction information from memory to a persistent storage
US6848021B2 (en) Efficient data backup using a single side file
CN110806933B (en) Batch task processing method, device, equipment and storage medium
US20160092488A1 (en) Concurrency control in a shared storage architecture supporting on-page implicit locks
AU2002308664B2 (en) Reducing latency and message traffic during data and lock transfer in a multi-node system
WO2019037617A1 (en) Data transaction processing method, device, and electronic device
WO2018018611A1 (en) Task processing method and network card
CN109241015B (en) Method for writing data in a distributed storage system
WO2020025049A1 (en) Data synchronization method and apparatus, database host, and storage medium
WO2017143824A1 (en) Transaction execution method, apparatus, and system
CN110851276A (en) Service request processing method, device, server and storage medium
JP2023541298A (en) Transaction processing methods, systems, devices, equipment, and programs
US11099960B2 (en) Dynamically adjusting statistics collection time in a database management system
US8359601B2 (en) Data processing method, cluster system, and data processing program
CN111104070A (en) Method and system for realizing data consistency in distributed system
CN111782419B (en) Cache updating method, device, equipment and storage medium
US11281654B2 (en) Customized roll back strategy for databases in mixed workload environments
WO2023246236A1 (en) Node configuration method, transaction log synchronization method and node for distributed database
WO2023116827A1 (en) High-concurrency data storage method and system
US10447607B2 (en) System and method for dequeue optimization using conditional iteration
CN111240810A (en) Transaction management method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200505