CN112463306A - Method for sharing disk data consistency in virtual machine - Google Patents

Method for sharing disk data consistency in virtual machine

Info

Publication number
CN112463306A
Authority
CN
China
Prior art keywords
virtual machine
shared disk
storage layer
request
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011409290.7A
Other languages
Chinese (zh)
Other versions
CN112463306B (en)
Inventor
袁进坤
张荣波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Astute Tec Co ltd
Original Assignee
Nanjing Astute Tec Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Astute Tec Co ltd
Priority to CN202011409290.7A
Publication of CN112463306A
Application granted
Publication of CN112463306B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662 Virtualisation aspects
    • G06F3/0665 Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45579 I/O management, e.g. providing access to device drivers or storage

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for shared disk data consistency in a virtual machine, comprising the following steps. S1: when each virtual machine mounts the shared disk, the current timestamp and the unique identification number of the virtual machine are written into a metadata area in the corresponding log volume. S2: when a virtual machine initiates a read-write request to a mounted shared disk, it acquires a unique IO request number, and the storage layer merges the requests of multiple virtual machines according to the logical area position of the data blocks to be accessed. When the storage layer of the shared disk detects that the virtual machines access different logical data block areas in the shared disk, no queuing is needed and the requests can execute concurrently; when the storage layer detects that at least two virtual machines initiate a write operation or a delete operation on the same logical data block area, the requests of those virtual machines queue in a thread pool in the shared disk in order of IO request number and wait to be executed. In the invention, multiple virtual machines thus control concurrent reading and writing of the shared disk at a finer granularity.

Description

Method for sharing disk data consistency in virtual machine
Technical Field
The invention belongs to the technical field of virtual machines, and particularly relates to a method for consistency of shared disk data in a virtual machine.
Background
A shared disk is a logical volume in the underlying data pool that provides virtual machine instances with a random-access read-write block device that can be accessed in shared mode. The shared disk mainly serves the high-availability requirements of traditional cluster applications, but it does not itself provide a cluster file system; a third-party cluster file system is needed to manage it.
Access to a common shared disk is controlled by the Persistent Reservation command in the SCSI protocol, and the whole process resembles a file lock. When each virtual machine mounts the shared disk, it registers a key on the corresponding logical volume and sets a reservation flag bit. When a conflicting write occurs, the system sends a conflict bit identifier to the other virtual machines, which then retry their read-write requests at a certain interval until the holder no longer holds the reservation flag bit. However, guaranteeing data consistency through conflict bit identifiers and read-write retries greatly limits highly concurrent access.
No effective solution has yet been proposed for the problem that guaranteeing data consistency through conflict bit identifiers and read-write retries severely limits highly concurrent access.
Disclosure of Invention
The invention aims to provide a method for shared disk data consistency in a virtual machine, solving the technical problem in the prior art that guaranteeing data consistency through conflict bit identifiers and read-write retries greatly limits highly concurrent access.
To solve the above technical problem, the invention adopts the following technical scheme:
A method for shared disk data consistency in a virtual machine, comprising the following steps:
S1: the shared disk is divided into a data volume and a log volume, wherein the data volume comprises a plurality of logical data blocks for storing virtual machine data, and the log volume stores the concurrent transactions of multiple virtual machines together with metadata. The log volume stores read-write transactions as key-value pairs; the storage layer of the shared disk asynchronously replays the transaction logs into the corresponding data blocks, so that the entire shared disk can be mounted and used by multiple virtual machines at the same time.
When each virtual machine mounts the shared disk, the current timestamp and the unique identification number of the virtual machine are written into the metadata area of the corresponding log volume;
S2: when a virtual machine initiates a read-write request to the mounted shared disk, it first acquires a unique IO request number from the storage layer of the shared disk:
S21: when the storage layer of the shared disk detects that the virtual machines access different logical data block areas in the shared disk, no queuing is needed and the requests can execute concurrently;
S22: when the storage layer of the shared disk detects that at least two virtual machines initiate a write operation or a delete operation on the same logical data block area, the requests of those virtual machines queue in a thread pool in the shared disk in order of IO request number.
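The dispatch rule of S21 and S22 can be illustrated with a small sketch. This is not the patented implementation: the class name `StorageLayer`, its methods, and the use of an integer to stand for a logical data block area are assumptions made for the example. The source does not say how a read that shares an area with a single writer is handled; the sketch lets that case run concurrently.

```python
import itertools
from collections import defaultdict


class StorageLayer:
    """Toy model of the S21/S22 dispatch rule (all names are hypothetical)."""

    def __init__(self):
        self._next_io = itertools.count(1)   # source of unique IO request numbers
        self._pending = defaultdict(list)    # block area -> [(io_no, vm_id, op)]

    def submit(self, vm_id, op, block):
        """Issue a unique IO request number and record the request."""
        io_no = next(self._next_io)
        self._pending[block].append((io_no, vm_id, op))
        return io_no

    def plan(self):
        """Split pending requests into concurrent ones and per-area queues."""
        concurrent, queued = [], {}
        for block, reqs in self._pending.items():
            writers = {vm for _, vm, op in reqs if op in ("write", "delete")}
            if len(writers) >= 2:
                # S22: two or more VMs write/delete the same area,
                # so the area's requests are serialized by IO request number.
                queued[block] = sorted(reqs)
            else:
                # S21: no conflicting writers, so no queuing is needed.
                concurrent.extend(reqs)
        return concurrent, queued


if __name__ == "__main__":
    layer = StorageLayer()
    layer.submit("vm-a", "write", block=7)
    layer.submit("vm-b", "write", block=9)
    layer.submit("vm-c", "delete", block=7)
    print(layer.plan())
```

Here area 7 is contended by two virtual machines and is queued, while the request for area 9 is free to run at once.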
As a further refinement, in step S2, when the virtual machines send read-write requests to the mounted shared disk, each virtual machine first obtains a unique IO request number from the storage layer of the shared disk, and the request is then placed into an internal thread pool to wait in a queue. At the same time, the storage layer merges the requests of the multiple virtual machines according to the logical area positions of the data blocks to be accessed: if some IO requests do not overlap with the logical area positions of the data blocks of other IO requests, they are handed to another thread for parallel processing; if there is overlap, they wait in the current queue to be executed as a batch.
As a further refinement, among the IO requests queued for a given logical data block area, when a delete operation request precedes other modify-write operation requests, the storage layer returns an error message such as "the corresponding data area has been deleted" to the corresponding upper-layer virtual machine for every subsequent modify-write operation in the queue, so that the cluster file system in the virtual machine can take further control.
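The delete-before-write rule can be sketched as a single pass over one area's queue. The function name `resolve_queue`, the tuple layout, and the status strings are assumptions for illustration. The source only covers modify-write requests that follow a delete, so reads and repeated deletes are simply reported as "ok" here.

```python
def resolve_queue(queue):
    """Process one logical area's queue in IO-request-number order.

    `queue` is a list of (io_no, vm_id, op) tuples (a hypothetical layout).
    Returns a dict mapping each io_no to the status reported to its VM.
    """
    results = {}
    deleted = False
    for io_no, vm_id, op in sorted(queue):
        if deleted and op == "write":
            # A delete ran earlier in the queue: reject the later write.
            results[io_no] = "error: the corresponding data area has been deleted"
        else:
            results[io_no] = "ok"
            if op == "delete":
                deleted = True
    return results


if __name__ == "__main__":
    print(resolve_queue([(1, "vm-a", "write"),
                         (2, "vm-b", "delete"),
                         (3, "vm-c", "write")]))
```

The write numbered 1 precedes the delete and succeeds; the write numbered 3 follows it and receives the error, which the cluster file system in the guest can then act on.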
As a further refinement, if the storage layer has received an IO request from a virtual machine but detects that the virtual machine that initiated the request has gone down abnormally, the storage layer places the IO request into the transaction log area of the corresponding log volume. When the virtual machine is powered on and mounts the shared disk again, the storage layer returns to it a warning that some transactions from the previous session were not executed, so that the cluster file system in the virtual machine can perform further processing.
As a further refinement, when a mounted virtual machine shuts down normally or unloads the shared disk, the storage layer clears the metadata belonging to that virtual machine from the corresponding log volume.
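The life cycle of the per-VM metadata (written at mount time, cleared on a normal shutdown or unload) can be sketched with an in-memory stand-in for the metadata area. The class `LogVolumeMetadata` and its method names are hypothetical, and a real metadata area would live on the log volume rather than in a dictionary.

```python
import time


class LogVolumeMetadata:
    """Toy metadata area of the log volume (names are hypothetical)."""

    def __init__(self):
        self._mounts = {}  # vm_id -> mount timestamp

    def on_mount(self, vm_id, now=None):
        """Record the current timestamp and the VM's unique ID at mount time."""
        self._mounts[vm_id] = time.time() if now is None else now

    def on_clean_unmount(self, vm_id):
        """Clear the VM's metadata on normal shutdown or disk unload."""
        self._mounts.pop(vm_id, None)

    def mounted(self):
        """Return the IDs of the VMs that currently have metadata recorded."""
        return sorted(self._mounts)


if __name__ == "__main__":
    meta = LogVolumeMetadata()
    meta.on_mount("vm-a")
    meta.on_mount("vm-b")
    meta.on_clean_unmount("vm-a")
    print(meta.mounted())
```

Under this model, an entry that survives without a matching clean unmount is what would let the storage layer recognize a virtual machine that went down abnormally.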
The technical scheme of the invention has the following beneficial effects:
In the invention, the shared disk is divided into a data volume and a log volume, wherein the data volume comprises a plurality of logical data blocks for storing virtual machine data, and the log volume stores the concurrent transactions of multiple virtual machines together with metadata. The log volume stores read-write transactions as key-value pairs; the storage layer of the shared disk asynchronously replays the transaction logs into the corresponding data blocks, so that the entire shared disk can be mounted and used by multiple virtual machines at the same time. Multiple virtual machines thus control concurrent reading and writing of the shared disk at a finer granularity, whereas traditional concurrency control obtains read-write permission for the entire shared disk through a SCSI lock, which is coarser-grained and yields suboptimal concurrency.
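The split between a key-value transaction log and an asynchronously replayed data volume can be sketched with a background thread. Everything here is an assumption for illustration: the class `SharedDisk`, the choice of the IO request number as the key and `(block, payload)` as the value, and the in-memory queue standing in for the log volume.

```python
import queue
import threading


class SharedDisk:
    """Toy shared disk: a key-value transaction log replayed into data blocks."""

    def __init__(self):
        self.data_volume = {}          # block index -> payload bytes
        self._log = queue.Queue()      # log volume: (io_no, (block, payload))
        worker = threading.Thread(target=self._replay, daemon=True)
        worker.start()

    def append(self, io_no, block, payload):
        """Write path: record the transaction in the log and return at once."""
        self._log.put((io_no, (block, payload)))

    def _replay(self):
        """Background replay: apply logged transactions in FIFO order."""
        while True:
            _io_no, (block, payload) = self._log.get()
            self.data_volume[block] = payload
            self._log.task_done()

    def wait_until_replayed(self):
        """Block until every logged transaction has been applied."""
        self._log.join()


if __name__ == "__main__":
    disk = SharedDisk()
    disk.append(1, 7, b"old")
    disk.append(2, 7, b"new")
    disk.wait_until_replayed()
    print(disk.data_volume)
```

Because writers only append to the log, they do not wait for the data blocks to be updated; the replay thread applies entries in the order they were logged.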
Drawings
FIG. 1 is a block diagram of a shared disk among multiple virtual machines according to the present invention.
FIG. 2 is a flowchart of a method for sharing disk data consistency in a virtual machine according to the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
As shown in FIG. 1, a method for shared disk data consistency in a virtual machine includes the following steps:
S1: the shared disk comprises a data volume and a log volume, wherein the data volume comprises a plurality of logical data blocks for storing virtual machine data, and the log volume stores the concurrent transactions of multiple virtual machines together with metadata. The log volume stores read-write transactions as key-value pairs; the storage layer of the shared disk asynchronously replays the transaction logs into the corresponding data blocks, so that the entire shared disk can be mounted and used by multiple virtual machines simultaneously, as shown in FIG. 2.
When N virtual machines mount the shared disk, each virtual machine writes the current timestamp and its unique identification number into the metadata area of the corresponding log volume, where N is a positive integer greater than or equal to 2.
S2: when a virtual machine initiates a read-write request to the mounted shared disk, it first acquires a unique IO request number from the storage layer of the shared disk:
S21: when the storage layer of the shared disk detects that the virtual machines access different logical data block areas in the shared disk, no queuing is needed and the requests can execute concurrently;
S22: when the storage layer of the shared disk detects that at least two virtual machines initiate a write operation or a delete operation on the same logical data block area, the requests of those virtual machines queue in a thread pool in the shared disk in order of IO request number.
In this embodiment, in step S2, each virtual machine first obtains a unique IO request number from the storage layer of the shared disk, and the request is then placed into an internal thread pool to wait in a queue. At the same time, the storage layer merges the requests of the multiple virtual machines according to the logical area positions of the data blocks to be accessed: if some IO requests do not overlap with the logical area positions of the data blocks of other IO requests, they are handed to another thread for parallel processing; if there is overlap, they wait in the current queue to be executed as a batch.
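One way to picture the merging step is to treat each request as a half-open block range and group ranges that overlap, directly or through a chain of overlaps, into one batch. The sketch below makes that assumption; the function names, the `(io_no, start, end)` tuple layout, and the use of `ThreadPoolExecutor` are illustrative choices, not details from the source.

```python
from concurrent.futures import ThreadPoolExecutor


def group_by_overlap(requests):
    """Group (io_no, start, end) requests whose half-open ranges overlap.

    Overlap is treated as transitive: a chain of overlapping ranges
    ends up in a single batch.
    """
    batches = []
    batch_end = None
    for req in sorted(requests, key=lambda r: r[1]):  # sweep by start offset
        _io_no, start, end = req
        if batches and start < batch_end:
            batches[-1].append(req)                   # overlaps current batch
            batch_end = max(batch_end, end)
        else:
            batches.append([req])                     # starts a new batch
            batch_end = end
    # Inside a batch, requests run in IO-request-number order.
    return [sorted(batch) for batch in batches]


def run_batches(batches, execute):
    """Run batches in parallel; run each batch's requests sequentially."""
    def run_one(batch):
        return [execute(req) for req in batch]

    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_one, batches))


if __name__ == "__main__":
    reqs = [(1, 0, 10), (2, 100, 110), (3, 5, 15), (4, 14, 20)]
    print(run_batches(group_by_overlap(reqs), lambda r: r[0]))
```

Requests 1, 3 and 4 touch overlapping ranges and are executed in order within one batch, while request 2 is independent and can be processed on another thread.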
In this embodiment, among the IO requests queued for a given logical data block area, when a delete operation request precedes other modify-write operation requests, the storage layer returns an error message such as "the corresponding data area has been deleted" to the corresponding upper-layer virtual machine for every subsequent modify-write operation in the queue, so that the cluster file system in the virtual machine can take further control.
In this embodiment, if the storage layer has received an IO request from a virtual machine but detects that the virtual machine that initiated the request has gone down abnormally, the storage layer places the IO request into the transaction log area of the corresponding log volume. When the virtual machine is powered on and mounts the shared disk again, the storage layer returns to it a warning that some transactions from the previous session were not executed, so that the cluster file system in the virtual machine can perform further processing.
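The crash-handling behaviour can be sketched as a small holding area keyed by virtual machine. The class `CrashAwareLog`, its method names, and the warning text are hypothetical. The source does not say whether the parked transactions are discarded after the warning is delivered; this sketch hands them back once and then forgets them.

```python
class CrashAwareLog:
    """Toy transaction log area that parks IO from crashed VMs."""

    def __init__(self):
        self._parked = {}  # vm_id -> list of accepted but unexecuted requests

    def on_vm_crash(self, vm_id, inflight):
        """Move the crashed VM's unexecuted requests into the log area."""
        self._parked.setdefault(vm_id, []).extend(inflight)

    def on_remount(self, vm_id):
        """Return (warning, parked requests) when the VM mounts again."""
        parked = self._parked.pop(vm_id, [])
        if parked:
            warning = "warning: some transactions from the last session were not executed"
            return warning, parked
        return None, []


if __name__ == "__main__":
    log = CrashAwareLog()
    log.on_vm_crash("vm-a", [(5, "write", 7)])
    print(log.on_remount("vm-a"))
```

The cluster file system inside the guest, not the storage layer, decides what to do with the returned requests.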
In this embodiment, when a mounted virtual machine shuts down normally or unloads the shared disk, the storage layer clears the metadata belonging to that virtual machine from the corresponding log volume.
The above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here, and obvious variations or modifications derived from the above remain within the scope of the invention.

Claims (5)

1. A method for shared disk data consistency in a virtual machine, comprising the steps of:
S1: the shared disk comprises a data volume and a log volume, wherein the data volume comprises a plurality of logical data blocks, and the log volume comprises a metadata area and a plurality of transaction logs; when each virtual machine mounts the shared disk, the current timestamp and the unique identification number of the virtual machine are written into the metadata area of the corresponding log volume;
S2: when a virtual machine initiates a read-write request to the mounted shared disk, it first acquires a unique IO request number from the storage layer of the shared disk:
S21: when the storage layer of the shared disk detects that the virtual machines access different logical data block areas in the shared disk, no queuing is needed and the requests can execute concurrently;
S22: when the storage layer of the shared disk detects that at least two virtual machines initiate a write operation or a delete operation on the same logical data block area, the requests of those virtual machines queue in a thread pool in the shared disk in order of IO request number and wait to be executed.
2. The method according to claim 1, wherein in step S2, when the virtual machines send read-write requests to the mounted shared disk, each virtual machine first obtains a unique IO request number from the storage layer of the shared disk, and the request is then placed into an internal thread pool to wait in a queue; at the same time, the storage layer merges the requests of the multiple virtual machines according to the logical area positions of the data blocks to be accessed; if some IO requests do not overlap with the logical area positions of the data blocks of other IO requests, they are handed to another thread for parallel processing; if there is overlap, they wait in the current queue to be executed as a batch.
3. The method according to claim 2, wherein among the IO requests queued for a given logical data block area, when a delete operation request precedes other modify-write operation requests, the storage layer returns, for every subsequent modify-write operation in the queue, the information that the corresponding data area has been deleted to the corresponding upper-layer virtual machine.
4. The method according to claim 3, wherein if the storage layer has received an IO request from a virtual machine but detects that the virtual machine that initiated the IO request has gone down abnormally, the storage layer places the IO request into the transaction log area of the corresponding log volume, and when the virtual machine boots and mounts the shared disk again, the storage layer returns to it the warning message "there are some transactions that were not executed last time".
5. The method according to claim 1, wherein when a mounted virtual machine is powered off normally or unloads the shared disk, the storage layer clears the metadata belonging to that virtual machine from the corresponding log volume.
CN202011409290.7A 2020-12-03 2020-12-03 Method for sharing disk data consistency in virtual machine Active CN112463306B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011409290.7A CN112463306B (en) 2020-12-03 2020-12-03 Method for sharing disk data consistency in virtual machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011409290.7A CN112463306B (en) 2020-12-03 2020-12-03 Method for sharing disk data consistency in virtual machine

Publications (2)

Publication Number Publication Date
CN112463306A (en) 2021-03-09
CN112463306B (en) 2024-07-23

Family

ID=74805492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011409290.7A Active CN112463306B (en) 2020-12-03 2020-12-03 Method for sharing disk data consistency in virtual machine

Country Status (1)

Country Link
CN (1) CN112463306B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8495323B1 (en) * 2010-12-07 2013-07-23 Symantec Corporation Method and system of providing exclusive and secure access to virtual storage objects in a virtual machine cluster
CN102664923A (en) * 2012-03-30 2012-09-12 浪潮电子信息产业股份有限公司 Method for realizing shared storage pool by utilizing Linux global file system
CN107704194A (en) * 2016-08-08 2018-02-16 北京忆恒创源科技有限公司 Lock-free IO processing method and device
CN111679795A (en) * 2016-08-08 2020-09-18 北京忆恒创源科技有限公司 Lock-free concurrent IO processing method and device
CN108038236A (en) * 2017-12-27 2018-05-15 深信服科技股份有限公司 File sharing method, device, system and readable storage medium
CN110554834A (en) * 2018-06-01 2019-12-10 阿里巴巴集团控股有限公司 File system data access method and file system
CN110955610A (en) * 2018-09-27 2020-04-03 三星电子株式会社 Method for operating storage device, storage device and storage system
CN110287044A (en) * 2019-07-02 2019-09-27 广州虎牙科技有限公司 Lock-free shared memory processing method and device, electronic equipment and readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113568736A (en) * 2021-06-24 2021-10-29 阿里巴巴新加坡控股有限公司 Data processing method and device
CN113568736B (en) * 2021-06-24 2024-07-30 阿里巴巴创新公司 Data processing method and device

Also Published As

Publication number Publication date
CN112463306B (en) 2024-07-23

Similar Documents

Publication Publication Date Title
US20140006687A1 (en) Data Cache Apparatus, Data Storage System and Method
JP2505939B2 (en) How to control data castout
KR100453228B1 (en) Journaling and recovery method for shared disk file system
US7818309B2 (en) Method for managing data access requests utilizing storage meta data processing
US20060143412A1 (en) Snapshot copy facility maintaining read performance and write performance
EP1199637A2 (en) Disk controller comprising cache memory and method of controlling the cache
US7539816B2 (en) Disk control device, disk control method
US7529902B2 (en) Methods and systems for locking in storage controllers
US8589438B2 (en) System for accessing shared data using multiple application servers
EP2144167B1 (en) Remote file system, terminal device, and server device
US6658541B2 (en) Computer system and a database access method thereof
US20100299512A1 (en) Network Boot System
KR20180025128A (en) Stream identifier based storage system for managing array of ssds
CN107888687B (en) Proxy client storage acceleration method and system based on distributed storage system
US20130290636A1 (en) Managing memory
CN113220490A (en) Transaction persistence method and system for asynchronous write-back persistent memory
EP4044015A1 (en) Data processing method and apparatus
CN114780025A (en) Software RAID (redundant array of independent disks) request processing method, controller and RAID storage system
US7593998B2 (en) File cache-controllable computer system
CN111061690B (en) RAC-based database log file reading method and device
CN112039999A (en) Method and system for accessing distributed block storage system in kernel mode
US6799228B2 (en) Input/output control apparatus, input/output control method and information storage system
CN112463306B (en) Method for sharing disk data consistency in virtual machine
US11442663B2 (en) Managing configuration data
JP2005258789A (en) Storage device, storage controller, and write back cache control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant