CN116069246A - Data read-write method and system for virtual machine - Google Patents

Data read-write method and system for virtual machine

Info

Publication number
CN116069246A
CN116069246A (application number CN202211528898.0A)
Authority
CN
China
Prior art keywords
channel
read
write request
shared memory
memory queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211528898.0A
Other languages
Chinese (zh)
Other versions
CN116069246B (en)
Inventor
张朝潞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd filed Critical Beijing Topsec Technology Co Ltd
Priority to CN202211528898.0A priority Critical patent/CN116069246B/en
Publication of CN116069246A publication Critical patent/CN116069246A/en
Application granted granted Critical
Publication of CN116069246B publication Critical patent/CN116069246B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system
    • G06F 3/0673 Single storage device
    • G06F 3/0674 Disk device
    • G06F 3/0676 Magnetic disk device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/544 Buffers; Shared memory; Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present disclosure relates to a data read-write method and system for a virtual machine. In the system, each host is configured with a virtual machine process and storage access software, the storage access software consisting of a shared memory queue group, an IO sending thread group, an IO receiving thread group, a channel selector, a channel load balancer and channels. The shared memory queue group is used to store read-write requests; the IO sending thread group sends the read-write requests to the distributed storage system through a designated channel; the IO receiving thread group receives the response results of the read-write requests from the designated channel and writes the response results into the shared memory queue; the channel selector selects a valid channel from the plurality of channels; and the channel load balancer distributes the read-write requests to the shared memory queue corresponding to each channel according to the weight of that channel. With this technical solution, the storage IO path is shortened, the complexity of the software architecture is reduced, and performance and stability are improved.

Description

Data read-write method and system for virtual machine
Technical Field
The present disclosure relates to the technical field of cloud computing, and in particular to a data read-write method, apparatus, device and storage medium for a virtual machine.
Background
In a virtualized or cloud computing environment, the virtual hard disks of virtual machines are typically provided by distributed storage and accessed over a TCP/IP network. To improve the availability of the network links, two or more independent storage networks are usually deployed, so that the network links through which the virtual disks access the distributed storage are redundant.
Currently, the distributed storage system is generally attached using iSCSI (Internet Small Computer System Interface). iSCSI is a client/server architecture: the client side is the iSCSI initiator and the server side is the iSCSI Target. In this mode, the read-write process involves several memory copies between user space and kernel space, the IO path of a read-write request inside the host is excessively long, and system resources are wasted.
Disclosure of Invention
In order to solve the technical problems, the present disclosure provides a data read-write method and system for a virtual machine.
In a first aspect, an embodiment of the present disclosure provides a data read-write system for a virtual machine, including:
a plurality of storage networks, a plurality of hosts;
wherein the plurality of hosts are connected to the plurality of storage networks;
each host is configured with a virtual machine process corresponding to the virtual machine and with storage access software, the storage access software consisting of a shared memory queue group, an IO sending thread group, an IO receiving thread group, a channel selector, a channel load balancer and channels;
the shared memory queue group consists of a plurality of shared memory queues and is used to store read-write requests issued when an application inside the virtual machine reads from or writes to the virtual disk;
the IO sending thread group is used to fetch the read-write requests from the shared memory queue and send them to the distributed storage system through a designated channel;
the IO receiving thread group is used to receive the response results of the read-write requests from the designated channel and write the response results into the shared memory queue;
the channel selector is used to select a valid channel from the plurality of channels according to probe results and the response results of the read-write requests;
the channel load balancer is configured to distribute each read-write request to the shared memory queue corresponding to a channel according to the weight of each channel, where the weight is determined according to the transmission capability of each storage network.
In a second aspect, an embodiment of the present disclosure provides a data read-write method for a virtual machine, including:
when an application inside the virtual machine reads from or writes to the virtual disk, a virtual disk device in the virtual machine process receives the read-write request and stores the read-write request in a shared memory queue;
acquiring the read-write request from the shared memory queue based on an IO sending thread group;
determining a target channel from a plurality of channels based on a channel selector or a channel load balancer according to a current working mode, and sending the read-write request to a distributed storage system through the target channel;
and receiving a response result of the read-write request from the target channel based on the IO receiving thread group, and writing the response result into the shared memory queue.
Optionally, the determining a target channel from a plurality of channels based on a channel selector or a channel load balancer according to the current working mode includes:
when the current working mode is an active-standby mode, determining a first channel from the plurality of channels based on the channel selector as the target channel;
and when the current working mode is a multi-active mode, determining, based on the channel load balancer and a weighted round-robin algorithm, the channel corresponding to the current read-write request from the plurality of channels as the target channel.
Optionally, the method further comprises:
when the current working mode is the active-standby mode, the channel selector periodically sends probe requests to each channel;
and when an abnormality of the target channel is detected, or the load of the target channel is higher than a threshold, the channel selector determines a second channel from the other available channels as the target channel.
Optionally, the method further comprises:
based on the IO receiving thread group, marking the read-write requests in the shared memory queue that have been sent but have not received a response as being in a retry state;
and, based on the IO sending thread group, resending the read-write requests marked as being in the retry state through the second channel.
Optionally, the shared memory queue is structured as an instruction area and a data area, where the instruction area is used to record the metadata of the read-write requests in the shared memory queue, each read-write request corresponds to one instruction item, and an instruction item consists of a request state, a retransmission count, a data start position and a data size.
Optionally, the shared memory queue is implemented on a multi-segment shared memory space which consists, in order, of the instruction area of each channel followed by the data area of each channel.
Optionally, the method further comprises:
when the working mode is switched from the active-standby mode to the multi-active mode, allocating a corresponding IO sending thread group and IO receiving thread group to each channel, and partitioning the space of the corresponding shared memory queues;
and enabling the channel load balancer and disabling the channel selector.
Optionally, the method further comprises:
when the working mode is switched from the multi-active mode to the active-standby mode, starting the channel selector, determining a main channel, and disabling the load balancer;
disabling the IO sending thread groups of the channels other than the main channel, the IO sending thread group of the main channel managing all instruction areas;
and the IO receiving thread groups of the channels other than the main channel continuing to complete the read-write requests that have been sent but not yet answered, and stopping and exiting once those requests are finished.
In a third aspect, an embodiment of the present disclosure provides a data read-write apparatus for a virtual machine, including:
a request module, configured so that, when an application inside the virtual machine reads from or writes to the virtual disk, the virtual disk device in the virtual machine process receives the read-write request and stores the read-write request in a shared memory queue;
the acquisition module is used for acquiring the read-write request from the shared memory queue based on the IO sending thread group;
the sending module is used to determine a target channel from a plurality of channels based on a channel selector or a channel load balancer according to the current working mode, and to send the read-write request to a distributed storage system through the target channel;
and the response module is used for receiving a response result of the read-write request from the target channel based on the IO receiving thread group and writing the response result into the shared memory queue.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instruction from the memory, and execute the instruction to implement the data read-write method for a virtual machine according to the second aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer readable storage medium, where a computer program is stored, where the computer program is executed by a processor to implement the data read-write method for a virtual machine according to the second aspect.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages: they enable the virtual machine to use storage over multiple paths, shorten the storage IO path, reduce the number of components, lower the complexity of the software architecture, and improve performance and stability.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a schematic diagram of a data read-write system for a virtual machine according to an embodiment of the disclosure;
fig. 2 is a schematic diagram of a data read-write method for a virtual machine according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of an active-standby mode according to an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a multi-active mode provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a shared memory space according to an embodiment of the disclosure;
FIG. 6 is a schematic diagram of another shared memory space according to an embodiment of the disclosure;
fig. 7 is a schematic structural diagram of a data read-write device for a virtual machine according to an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it will be apparent that the embodiments in the specification are only some, but not all, embodiments of the disclosure.
The embodiment of the disclosure provides a data read-write system for a virtual machine, comprising a plurality of storage networks and a plurality of hosts; storage IO (Input/Output) multipathing requires multiple physical networks to implement, and the hosts are connected to the storage networks.
As shown in fig. 1, a schematic diagram of a data read-write system based on two storage networks is shown.
In this embodiment, the two storage networks are formed by storage switch 1 and storage switch 2, each of which is connected to a different network card of every host.
Each host is configured with a virtual machine process corresponding to the virtual machine and with storage access software, the storage access software consisting of a shared memory queue group, an IO sending thread group, an IO receiving thread group, a channel selector, a channel load balancer and channels;
the shared memory queue group consists of a plurality of shared memory queues and is used to store read-write requests issued when an application inside the virtual machine reads from or writes to the virtual disk;
the IO sending thread group is used to fetch the read-write requests from the shared memory queue and send them to the distributed storage system through a designated channel;
the IO receiving thread group is used to receive the response results of the read-write requests from the designated channel and write the response results into the shared memory queue;
the channel selector is used to select a valid channel from the plurality of channels according to probe results and the response results of the read-write requests;
and the channel load balancer is used to distribute the read-write requests to the shared memory queue corresponding to each channel according to the weight of the channel, where the weight is determined according to the transmission capability of each storage network.
This embodiment relates to an architecture in which computing (virtualization) and storage are converged: virtualization provides life-cycle management of virtual machines, while the distributed storage system abstracts and uniformly manages the physical hard disks of the hosts and provides storage services externally. The virtual machine's disk driver is part of the operating system inside the virtual machine, and the virtual disk device is provided by the virtual machine process through software emulation; on a Linux system the virtual machine monitor is Qemu/KVM, and emulation of the disk is provided by Qemu. In this embodiment, a BlockDriver based on the distributed storage system is implemented (a DSBD driver, Distributed Storage System BlockDriver).
Compared with the related-art architecture in which a virtual machine accesses a distributed storage system through the iSCSI multipath technology, a read-write request in the iSCSI architecture travels from the virtual machine process (user space) to the iSCSI initiator (kernel space), then to the iSCSI Target (user space), and back into kernel space (where the software sends the network request), which involves three memory copies between user space and kernel space; the IO path of a read-write request inside the host is therefore excessively long, and system resources are wasted. In addition, that implementation architecture is complex: it introduces DeviceMapper (the kernel md driver), the iSCSI initiator (the kernel iSCSI driver and a user-space iSCSI process) and the iSCSI Target (tgt), so read-write requests are transferred over the TCP/IP network even within the same host. When storage IO multipathing is configured for a virtual machine, the software modules, network communication ports and so on introduced above all need to be configured, which makes configuration and management complex and in turn reduces the stability of the software as a whole.
In contrast, the system of the embodiment of the present disclosure realizes multipath use of storage by the virtual machine through the above architecture while involving only one memory copy from user space to kernel space, thereby reducing the complexity of the system structure and the length of the IO path and improving system performance and stability.
Fig. 2 is a schematic flow chart of a data read-write method for a virtual machine according to an embodiment of the present disclosure, where the method provided by the embodiment of the present disclosure may be performed by a data read-write device for a virtual machine, and the device may be implemented by using software and/or hardware and may be integrated on any electronic device with computing capability.
As shown in fig. 2, a data read-write method for a virtual machine provided by an embodiment of the present disclosure may include:
in step 201, when an internal application program of the virtual machine reads and writes to the virtual disk, a virtual disk device in the virtual machine process receives a read-write request, and stores the read-write request in a shared memory queue.
Step 202, obtaining a read-write request from a shared memory queue based on the IO sending thread group.
The method of this embodiment is implemented on the basis of the data read-write system described above.
In this embodiment, the shared memory queue is used to store read-write requests and is implemented on top of a shared memory space. The shared memory queue group is a set of shared-memory-based ring queues allocated by the storage access software for storing read-write requests, and the DSBD driver places read-write requests into these ring queues. The IO sending thread group consists of one or more threads whose work is to fetch a read-write request from the shared memory queue, process it, and then send it to the distributed storage system through the designated channel.
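To make the queue mechanics concrete, the following is a minimal sketch, in C, of a single-producer/single-consumer ring queue over a shared memory region; the type and field names (ring, slots, RING_DEPTH) and the exact layout are illustrative assumptions rather than details taken from this disclosure. In this role the DSBD driver would act as the producer of a queue and an IO sending thread as its consumer.

/* Minimal SPSC ring-queue sketch over shared memory (illustrative). */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_DEPTH 1024u                 /* matches the virtual disk hardware queue depth */

struct ring {
    _Atomic uint32_t head;               /* next slot the producer will write */
    _Atomic uint32_t tail;               /* next slot the consumer will read */
    uint32_t slots[RING_DEPTH];          /* indexes into the instruction area */
};

static bool ring_push(struct ring *r, uint32_t cmd_index)   /* called by the DSBD driver */
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_DEPTH)       /* queue full */
        return false;
    r->slots[head % RING_DEPTH] = cmd_index;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

static bool ring_pop(struct ring *r, uint32_t *cmd_index)   /* called by an IO sending thread */
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail == head)                    /* queue empty */
        return false;
    *cmd_index = r->slots[tail % RING_DEPTH];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}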
Step 203, determining a target channel from a plurality of channels based on a channel selector or a channel load equalizer according to the current working mode, and sending the read-write request to the distributed storage system through the target channel.
The working modes of the storage access software include an active-standby mode and a multi-active mode. A channel is an encapsulation of a TCP/IP connection established with the distributed storage, with one channel per storage network. The channel selector selects a valid channel according to the results of read-write requests and the results of its own probing. The channel load balancer sets a weight for each channel according to the transmission capability of the corresponding storage network, and distributes read-write requests to the shared memory queue of each channel based on that weight value.
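As one concrete way to realize such weight-based distribution, the sketch below implements smooth weighted round-robin selection over the channels; the structure and function names are assumptions for illustration, and this disclosure does not prescribe this particular variant of the algorithm. A channel attached to a faster storage network would simply be configured with a larger weight and therefore receive proportionally more read-write requests.

/* Smooth weighted round-robin channel selection (illustrative sketch). */
#include <stddef.h>

struct wrr_channel {
    int weight;    /* derived from the transmission capability of the storage network */
    int current;   /* running counter maintained by the scheduler */
};

static struct wrr_channel *pick_channel(struct wrr_channel *ch, size_t n)
{
    int total = 0;
    struct wrr_channel *best = NULL;

    for (size_t i = 0; i < n; i++) {
        ch[i].current += ch[i].weight;   /* every channel gains its weight each round */
        total += ch[i].weight;
        if (best == NULL || ch[i].current > best->current)
            best = &ch[i];
    }
    if (best != NULL)
        best->current -= total;          /* the winner pays back the total weight */
    return best;                         /* its shared memory queue receives the request */
}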
In one embodiment of the present disclosure, in the active-standby mode only one of the plurality of storage networks is used to transmit data at any given time. The storage access software establishes channels to the distributed storage system through each of the storage networks, and only one of these channels is in effect at a time. The IO sending thread fetches the read-write request from the shared memory queue, selects the main channel through the channel selector, and sends the read-write request to the distributed storage system through that channel. For the system shown in fig. 1, the storage access software establishes connections to the distributed storage system through the two storage networks, namely channel 1 and channel 2, and only one of them is active at a time.
Referring to fig. 3, when the current working mode is the active-standby mode, a first channel is determined from the plurality of channels by the channel selector as the target channel, over which all read-write requests are sent.
Optionally, in the active-standby mode the channel selector periodically sends a probe request to each channel, and when it detects an abnormality of the target channel, or the load of the target channel is higher than a threshold, the channel selector determines a second channel from the other available channels as the target channel. Concretely, the channel selector probes each channel periodically; if a channel becomes disconnected, times out or otherwise misbehaves, or if the load of the channel in use exceeds the load of the other available channels by more than a threshold, the channel selector switches to another available channel as the main channel of the active-standby mode.
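A minimal sketch of this failover decision is shown below, assuming that probe (heartbeat) results and per-channel load counters are gathered elsewhere; the identifiers and the concrete threshold value are illustrative assumptions.

/* Channel-selector failover decision (illustrative sketch). */
#include <stdbool.h>

#define LOAD_DIFF_THRESHOLD 128          /* assumed value: outstanding requests */

struct chan_state {
    bool alive;                          /* last probe request succeeded */
    int  load;                           /* outstanding read-write requests */
};

/* Pick the least-loaded available channel other than 'skip'. */
static int pick_backup(const struct chan_state *c, int n, int skip)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (i == skip || !c[i].alive)
            continue;
        if (best < 0 || c[i].load < c[best].load)
            best = i;
    }
    return best;
}

/* Decide which channel should be the main channel after a probe round. */
static int select_main_channel(const struct chan_state *c, int n, int current)
{
    int backup = pick_backup(c, n, current);

    if (backup < 0)
        return current;                                       /* nothing to switch to */
    if (!c[current].alive)
        return backup;                                        /* disconnect or timeout */
    if (c[current].load - c[backup].load > LOAD_DIFF_THRESHOLD)
        return backup;                                        /* load gap above threshold */
    return current;
}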
Optionally, based on the IO receiving thread group, the read-write requests in the shared memory queue that have been sent but have not received a response are marked as being in a retry state, and based on the IO sending thread group, the read-write requests marked as being in the retry state are resent through the second channel.
In one embodiment of the present disclosure, in the multi-active mode multiple storage networks are used simultaneously, and a shared memory queue, an IO sending thread group and an IO receiving thread group are configured for each channel. For the above steps, after the DSBD driver receives a read-write request it hands the request to the channel load balancer, which places the request into the shared memory queue of the corresponding channel using a weighted round-robin algorithm; the subsequent processing by the IO sending threads and IO receiving threads is the same as in the active-standby mode.
Referring to fig. 4, the channel corresponding to the read-write request is determined from the plurality of channels by the channel load balancer using a weighted round-robin algorithm and is used as the target channel. It should be noted that the two shared memory spaces (the plurality of shared memory queues) shown in fig. 4 are a logical illustration and should not be construed as limiting the present application.
Step 204, receiving a response result of the read-write request from the target channel based on the IO receiving thread group, and writing the response result into the shared memory queue.
In the active-standby mode, the IO receiving thread group receives the response results of read-write requests from the current main channel and writes the response results into the shared memory queue. In the multi-active mode, for a given channel, the IO receiving thread group of that channel receives the response results of read-write requests from that channel and writes them into the shared memory queue corresponding to the channel. The IO receiving thread group consists of one or more threads whose work is to receive the responses to read-write requests from the channel and, after processing, write the results of the read-write requests into the shared memory queue.
In this embodiment, the shared memory queue is structured as an instruction area and a data area, where the instruction area records the metadata of the read-write requests in the shared memory queue, each read-write request corresponds to one instruction item, and an instruction item consists of a request state, a retransmission count, a data start position and a data size.
As an example, as shown in fig. 5, an instruction item comprises the fields status, retries, offset and size. The status field occupies 4 bits and represents the request state: to be sent, to be retried, completed, error, and so on. The retries field occupies 6 bits and records how many times the request has been resent on the channel. The offset field occupies 32 bits and indicates the start position, within the data area, of the data corresponding to this instruction, so the data area is at most 4 GB. The size field occupies 22 bits, so the maximum data size of the read-write request corresponding to the instruction is 4 MB; since the maximum hardware queue depth of the virtual disk is 1024, the maximum data area is 4 GB.
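The 64-bit instruction item can be expressed, for example, as a C bit-field structure. The field names and widths below follow the text and fig. 5; the enum values and identifier names are assumptions, and bit-fields on uint64_t rely on a common compiler extension.

/* Instruction item of the shared memory queue (4 + 6 + 32 + 22 = 64 bits). */
#include <stdint.h>

enum dsbd_req_status {                   /* possible values of the 4-bit state */
    DSBD_REQ_PENDING = 0,                /* to be sent */
    DSBD_REQ_RETRY   = 1,                /* to be retried */
    DSBD_REQ_DONE    = 2,                /* completed */
    DSBD_REQ_ERROR   = 3                 /* error */
};

struct dsbd_cmd_item {
    uint64_t status  : 4;                /* request state */
    uint64_t retries : 6;                /* number of resends on the channel */
    uint64_t offset  : 32;               /* start of the payload in the data area (max 4 GB) */
    uint64_t size    : 22;               /* payload size, at most 4 MB per request */
};

With 1024 such 8-byte items per queue, the instruction area itself occupies only 8 KB, while the data area it indexes can grow to 1024 x 4 MB = 4 GB.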
The shared memory space is described below. In this embodiment, the shared memory queue is implemented on a multi-segment shared memory space which consists, in order, of the instruction area of each channel followed by the data area of each channel. For example, as shown in fig. 6, in the multi-active mode of a two-channel system the multi-segment shared memory space consists, in order, of the channel 1 instruction area, the channel 2 instruction area, the channel 1 data area and the channel 2 data area. Optionally, the number of items in the instruction area equals the depth of the virtual disk queue; multiple channels can share the queue, and lock-free access can be made safe by partitioning the space into intervals, as in fig. 6 where the instruction areas and the data areas of channel 1 and channel 2 are not shared.
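The sketch below computes where each channel's instruction area and data area would start inside such a multi-segment region, using the maxima given above (queue depth 1024, at most 4 MB per request) and assuming 8-byte instruction items; the helper names are illustrative.

/* Offsets inside the multi-segment shared memory region (illustrative sketch). */
#include <stdint.h>

#define QUEUE_DEPTH    1024ULL                              /* virtual disk queue depth */
#define CMD_ITEM_SIZE  8ULL                                 /* 64-bit instruction item */
#define CMD_AREA_SIZE  (QUEUE_DEPTH * CMD_ITEM_SIZE)        /* 8 KB per channel */
#define DATA_AREA_SIZE (QUEUE_DEPTH * 4ULL * 1024 * 1024)   /* up to 4 GB per channel */

/* Instruction areas of all channels come first, one after another. */
static inline uint64_t cmd_area_offset(unsigned chan)
{
    return (uint64_t)chan * CMD_AREA_SIZE;
}

/* Data areas follow all instruction areas, again one per channel. */
static inline uint64_t data_area_offset(unsigned chan, unsigned nchans)
{
    return (uint64_t)nchans * CMD_AREA_SIZE + (uint64_t)chan * DATA_AREA_SIZE;
}

/* Total size of the multi-segment shared memory region. */
static inline uint64_t region_size(unsigned nchans)
{
    return (uint64_t)nchans * (CMD_AREA_SIZE + DATA_AREA_SIZE);
}

For the two-channel layout of fig. 6 this places the channel 1 instruction area at offset 0, the channel 2 instruction area at 8 KB, the channel 1 data area at 16 KB, and the channel 2 data area 4 GB after that.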
Further, the active-standby mode is suitable when the load on the virtual disk is low, while the multi-active mode is suitable for high virtual disk load. This embodiment supports free switching between the two modes: unlike existing implementations, in which the mode must be configured at startup and cannot be modified afterwards without a restart, here the active-standby mode and the multi-active mode can be switched flexibly without restarting the system.
The switching of the working mode is described below.
When the working mode is switched from the active-standby mode to the multi-active mode, a corresponding IO sending thread group and IO receiving thread group are allocated to each channel and the space of the corresponding shared memory queues is partitioned; the channel load balancer is enabled and the channel selector is disabled. During the switchover away from the original working channel of the active-standby mode, there may be instruction items corresponding to read-write requests that have been sent but not yet answered; these instruction items may be partitioned into the management of other channels, and the requests continue to be completed by the original IO receiving thread, so no special handling is required.
When the working mode is switched from the multi-active mode to the active-standby mode, the channel selector is started, a main channel is determined, and the load balancer is disabled; the IO sending thread groups of the channels other than the main channel are disabled, and the IO sending thread group of the main channel manages all instruction areas; the IO receiving thread groups of the channels other than the main channel continue to complete the read-write requests that have been sent but not yet answered, and stop and exit once those requests are finished. In this way, flexible switching between the active-standby mode and the multi-active mode is achieved with almost no additional system overhead, and the virtual disk can adapt to more complex loads.
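To illustrate the bookkeeping involved in such a switch, the sketch below tracks only which components are enabled in each mode; the structures and all identifiers are assumptions, and starting, draining and stopping the actual thread groups is left out.

/* Mode-switch bookkeeping (illustrative sketch, up to 8 channels). */
#include <stdbool.h>

enum io_mode { MODE_ACTIVE_STANDBY, MODE_MULTI_ACTIVE };

struct channel_ctl {
    bool send_threads_enabled;
    bool recv_threads_enabled;           /* receive group drains in-flight IO before exit */
};

struct dsbd_ctl {
    enum io_mode       mode;
    bool               selector_enabled;
    bool               balancer_enabled;
    int                primary;          /* main channel in active-standby mode */
    int                nchans;
    struct channel_ctl chan[8];
};

static void switch_to_multi_active(struct dsbd_ctl *c)
{
    for (int i = 0; i < c->nchans; i++) {
        c->chan[i].send_threads_enabled = true;   /* each channel gets its own thread groups */
        c->chan[i].recv_threads_enabled = true;   /* and its own slice of the queue space */
    }
    c->balancer_enabled = true;
    c->selector_enabled = false;
    c->mode = MODE_MULTI_ACTIVE;
}

static void switch_to_active_standby(struct dsbd_ctl *c, int new_primary)
{
    c->selector_enabled = true;
    c->primary          = new_primary;
    c->balancer_enabled = false;
    for (int i = 0; i < c->nchans; i++) {
        if (i == new_primary)
            continue;
        /* the main channel's send group takes over all instruction areas; the other
           receive groups keep running until their in-flight requests complete, then exit */
        c->chan[i].send_threads_enabled = false;
    }
    c->mode = MODE_ACTIVE_STANDBY;
}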
According to the technical solution of the embodiments of the present disclosure, the virtual machine can use storage over multiple paths, the storage IO path is shortened, the number of components is reduced, the complexity of the software architecture is lowered, and performance and stability are improved.
Fig. 7 is a schematic structural diagram of a data read-write device for a virtual machine according to an embodiment of the present disclosure, and as shown in fig. 7, the data read-write device for a virtual machine includes: the request module 71, the acquisition module 72, the transmission module 73, the response module 74.
The request module 71 is configured so that, when an application inside the virtual machine reads from or writes to the virtual disk, the virtual disk device in the virtual machine process receives the read-write request and stores the read-write request in a shared memory queue;
an obtaining module 72, configured to obtain the read-write request from the shared memory queue based on an IO sending thread group;
a sending module 73, configured to determine a target channel from a plurality of channels based on a channel selector or a channel load balancer according to a current operation mode, and send the read-write request to a distributed storage system through the target channel;
and a response module 74, configured to receive a response result of the read-write request from the target channel based on the IO receiving thread group, and write the response result into the shared memory queue.
In one embodiment of the present disclosure, the sending module 73 is specifically configured to: when the current working mode is the active-standby mode, determine a first channel from the plurality of channels based on the channel selector as the target channel; and when the current working mode is the multi-active mode, determine, based on the channel load balancer and a weighted round-robin algorithm, the channel corresponding to the current read-write request from the plurality of channels as the target channel.
In one embodiment of the present disclosure, the apparatus further comprises a selection module, configured so that, when the current working mode is the active-standby mode, the channel selector periodically sends probe requests to each channel; and when an abnormality of the target channel is detected, or the load of the target channel is higher than a threshold, the channel selector determines a second channel from the other available channels as the target channel.
In one embodiment of the present disclosure, the apparatus further comprises a retry module, configured to mark, based on the IO receiving thread group, the read-write requests in the shared memory queue that have been sent but have not received a response as being in a retry state, and to resend, based on the IO sending thread group, the read-write requests marked as being in the retry state through the second channel.
In one embodiment of the disclosure, the shared memory queue is structured as an instruction area and a data area, where the instruction area records the metadata of the read-write requests in the shared memory queue, each read-write request corresponds to one instruction item, and an instruction item consists of a request state, a retransmission count, a data start position and a data size.
In one embodiment of the present disclosure, the shared memory queue is implemented based on a multi-segment shared memory space, which is sequentially composed of an instruction area of each channel and a data area of each channel.
In one embodiment of the present disclosure, the apparatus further comprises a first switching module, configured to, when the working mode is switched from the active-standby mode to the multi-active mode, allocate a corresponding IO sending thread group and IO receiving thread group to each channel, partition the space of the corresponding shared memory queues, enable the channel load balancer and disable the channel selector.
In one embodiment of the present disclosure, the apparatus further comprises a second switching module, configured to, when the working mode is switched from the multi-active mode to the active-standby mode, start the channel selector, determine a main channel and disable the load balancer; disable the IO sending thread groups of the channels other than the main channel, with the IO sending thread group of the main channel managing all instruction areas; and have the IO receiving thread groups of the channels other than the main channel continue to complete the read-write requests that have been sent but not yet answered, stopping and exiting once those requests are finished.
The data read-write device for the virtual machine provided by the embodiment of the disclosure can execute any data read-write method for the virtual machine provided by the embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the execution method. Details of the embodiments of the apparatus of the present disclosure that are not described in detail may refer to descriptions of any of the embodiments of the method of the present disclosure.
The embodiment of the disclosure also provides an electronic device, which comprises one or more processors and a memory. The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform the desired functions. The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, random Access Memory (RAM) and/or cache memory (cache) and the like. The non-volatile memory may include, for example, read Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on a computer readable storage medium and a processor may execute the program instructions to implement the methods of embodiments of the present disclosure above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
In one example, the electronic device may further include: input devices and output devices, which are interconnected by a bus system and/or other forms of connection mechanisms. In addition, the input device may include, for example, a keyboard, a mouse, and the like. The output device may output various information including the determined distance information, direction information, etc., to the outside. The output means may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc. In addition, the electronic device may include any other suitable components, such as a bus, input/output interface, etc., depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform any of the methods provided by the embodiments of the present disclosure.
The computer program product may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform any of the methods provided by the embodiments of the present disclosure.
A computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data read-write system for a virtual machine, comprising:
a plurality of storage networks, a plurality of hosts;
wherein the plurality of hosts are connected to the plurality of storage networks;
each host is configured with a virtual machine process corresponding to the virtual machine and with storage access software, the storage access software consisting of a shared memory queue group, an IO sending thread group, an IO receiving thread group, a channel selector, a channel load balancer and channels;
the shared memory queue group consists of a plurality of shared memory queues and is used to store read-write requests issued when an application inside the virtual machine reads from or writes to the virtual disk;
the IO sending thread group is used to fetch the read-write requests from the shared memory queue and send them to the distributed storage system through a designated channel;
the IO receiving thread group is used to receive the response results of the read-write requests from the designated channel and write the response results into the shared memory queue;
the channel selector is used to select a valid channel from the plurality of channels according to probe results and the response results of the read-write requests;
and the channel load balancer is configured to distribute each read-write request to the shared memory queue corresponding to a channel according to the weight of each channel, where the weight is determined according to the transmission capability of each storage network.
2. A method of data reading and writing based on the system of claim 1, the method comprising:
when an application inside the virtual machine reads from or writes to the virtual disk, a virtual disk device in the virtual machine process receives a read-write request and stores the read-write request in a shared memory queue;
acquiring the read-write request from the shared memory queue based on an IO sending thread group;
determining a target channel from a plurality of channels based on a channel selector or a channel load balancer according to a current working mode, and sending the read-write request to a distributed storage system through the target channel;
and receiving a response result of the read-write request from the target channel based on the IO receiving thread group, and writing the response result into the shared memory queue.
3. The method of claim 2, wherein the determining a target channel from a plurality of channels based on a channel selector or a channel load balancer according to the current working mode comprises:
when the current working mode is an active-standby mode, determining a first channel from the plurality of channels based on the channel selector as the target channel;
and when the current working mode is a multi-active mode, determining, based on the channel load balancer and a weighted round-robin algorithm, the channel corresponding to the current read-write request from the plurality of channels as the target channel.
4. A method as recited in claim 3, further comprising:
when the current working mode is the active-standby mode, the channel selector periodically sends probe requests to each channel;
and when an abnormality of the target channel is detected, or the load of the target channel is higher than a threshold, the channel selector determines a second channel from the other available channels as the target channel.
5. The method as recited in claim 4, further comprising:
based on the IO receiving thread group, marking the read-write requests in the shared memory queue that have been sent but have not received a response as being in a retry state;
and, based on the IO sending thread group, resending the read-write requests marked as being in the retry state through the second channel.
6. The method of claim 2, wherein the shared memory queue is structured as an instruction area and a data area, wherein the instruction area is used to record the metadata of the read-write requests in the shared memory queue, each read-write request corresponds to one instruction item, and an instruction item consists of a request state, a retransmission count, a data start position and a data size.
7. The method of claim 6, wherein the shared memory queue is implemented on a multi-segment shared memory space which consists, in order, of the instruction area of each channel followed by the data area of each channel.
8. The method as recited in claim 2, further comprising:
when the working mode is switched from the active-standby mode to the multi-active mode, allocating a corresponding IO sending thread group and IO receiving thread group to each channel, and partitioning the space of the corresponding shared memory queues;
and enabling the channel load balancer and disabling the channel selector.
9. The method as recited in claim 2, further comprising:
when the working mode is switched from the multi-active mode to the active-standby mode, starting the channel selector, determining a main channel, and disabling the load balancer;
disabling the IO sending thread groups of the channels other than the main channel, the IO sending thread group of the main channel managing all instruction areas;
and the IO receiving thread groups of the channels other than the main channel continuing to complete the read-write requests that have been sent but not yet answered, and stopping and exiting once those requests are finished.
10. A data read-write apparatus for a virtual machine, comprising:
a request module, configured so that, when an application inside the virtual machine reads from or writes to the virtual disk, the virtual disk device in the virtual machine process receives the read-write request and stores the read-write request in a shared memory queue;
the acquisition module is used for acquiring the read-write request from the shared memory queue based on the IO sending thread group;
the sending module is used to determine a target channel from a plurality of channels based on a channel selector or a channel load balancer according to the current working mode, and to send the read-write request to a distributed storage system through the target channel;
and the response module is used for receiving a response result of the read-write request from the target channel based on the IO receiving thread group and writing the response result into the shared memory queue.
CN202211528898.0A 2022-11-30 2022-11-30 Data read-write method and system for virtual machine Active CN116069246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211528898.0A CN116069246B (en) 2022-11-30 2022-11-30 Data read-write method and system for virtual machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211528898.0A CN116069246B (en) 2022-11-30 2022-11-30 Data read-write method and system for virtual machine

Publications (2)

Publication Number Publication Date
CN116069246A (en) 2023-05-05
CN116069246B CN116069246B (en) 2023-08-29

Family

ID=86170709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211528898.0A Active CN116069246B (en) 2022-11-30 2022-11-30 Data read-write method and system for virtual machine

Country Status (1)

Country Link
CN (1) CN116069246B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693162A (en) * 2011-12-29 2012-09-26 中国科学技术大学苏州研究院 Method for process communication among multiple virtual machines on multi-core platform based on shared memory and intercore interruption
CN107491354A (en) * 2017-07-03 2017-12-19 北京东土科技股份有限公司 A kind of inter-virtual machine communication method and device based on shared drive
CN107491355A (en) * 2017-08-17 2017-12-19 山东浪潮商用系统有限公司 Funcall method and device between a kind of process based on shared drive
CN110018782A (en) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 A kind of data read/write method and relevant apparatus
CN111813579A (en) * 2020-07-17 2020-10-23 济南浪潮数据技术有限公司 Communication method, communication device, readable storage medium and file system
US20210073082A1 (en) * 2019-09-06 2021-03-11 Veritas Technologies Llc Systems and methods for marking application-consistent points-in-time
CN113110916A (en) * 2021-04-22 2021-07-13 深信服科技股份有限公司 Virtual machine data reading and writing method, device, equipment and medium
CN113590254A (en) * 2020-04-30 2021-11-02 深信服科技股份有限公司 Virtual machine communication method, device, system and medium


Also Published As

Publication number Publication date
CN116069246B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
US10884799B2 (en) Multi-core processor in storage system executing dynamic thread for increased core availability
US9413683B2 (en) Managing resources in a distributed system using dynamic clusters
KR102506605B1 (en) Rack-level scheduling for reducing the long tail latency using high performance ssds
US20200233704A1 (en) Multi-core processor in storage system executing dedicated polling thread for increased core availability
US8271743B2 (en) Automated paging device management in a shared memory partition data processing system
US7694073B2 (en) Computer system and a method of replication
US20050097384A1 (en) Data processing system with fabric for sharing an I/O device between logical partitions
CN110609730B (en) Method and equipment for realizing interrupt transparent transmission between virtual processors
US20130067164A1 (en) Methods and structure for implementing logical device consistency in a clustered storage system
US20140095769A1 (en) Flash memory dual in-line memory module management
KR20180117641A (en) Efficient live-transfer of remotely accessed data
US8738890B2 (en) Coupled symbiotic operating system
US20120144146A1 (en) Memory management using both full hardware compression and hardware-assisted software compression
JP6028415B2 (en) Data migration control device, method and system for virtual server environment
US20210224177A1 (en) Performance monitoring for storage system with core thread comprising internal and external schedulers
WO2020219810A1 (en) Intra-device notational data movement system
US8099563B2 (en) Storage device and access instruction sending method
CN116069246B (en) Data read-write method and system for virtual machine
US9088569B2 (en) Managing access to a shared resource using client access credentials
US10788995B2 (en) Information processing apparatus, method and non-transitory computer-readable storage medium
US9256648B2 (en) Data handling in a cloud computing environment
KR101559929B1 (en) Apparatus and method for virtualization
US10437471B2 (en) Method and system for allocating and managing storage in a raid storage system
US7930438B2 (en) Interrogate processing for complex I/O link
US20200348962A1 (en) Memory-fabric-based processor context switching system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant