WO2019195969A1 - Data synchronization processing method and apparatus - Google Patents

Data synchronization processing method and apparatus

Publication number
WO2019195969A1
Authority
WO
WIPO (PCT)
Prior art keywords
thread
node
data packet
processed
buffer module
Prior art date
Application number
PCT/CN2018/082225
Other languages
French (fr)
Chinese (zh)
Inventor
王成
陈旭升
崔鹤鸣
沈伟锋
白龙
毕舒展
刘祖齐
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2018/082225 (WO2019195969A1)
Priority to CN201880004742.8A (CN110622478B)
Publication of WO2019195969A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication

Definitions

  • the present application relates to the field of computers, and in particular, to a method and apparatus for data synchronization processing.
  • Virtualization is part of a compute node (hereafter referred to as a "node") that provides an isolated virtualized computing environment.
  • a typical example of virtualization is a virtual machine (VM).
  • a virtual machine is a virtual device that is simulated on a physical device by virtual machine software. For applications running in virtual machines, these virtual machines work just like real physical devices, which can have operating systems and applications installed on them, and virtual machines can access network resources.
  • the same database that is, the distributed database
  • the virtual machine automatically takes over the business.
  • the working status of the active and standby VMs is usually not the same. Therefore, the standby VM needs a certain amount of time to synchronize the working status of the active and standby VMs before taking over the services.
  • The node where the primary virtual machine is located performs synchronization processing of the active and standby virtual machines based on a consistency negotiation protocol. For example, the active and standby virtual machines process data packets in the same order, and the states of the active and standby VMs are synchronized periodically or irregularly, thus reducing the difference in the working status of the active and standby VMs.
  • the thread of the master node (for example, the main loop thread) needs to occupy the global mutex when performing the synchronization processing of the master and slave VMs.
  • The global mutex prohibits other threads from accessing the code corresponding to the primary virtual machine, so the primary virtual machine cannot process other tasks while the main thread is synchronizing, resulting in a significant drop in the performance of the primary virtual machine.
  • the present application provides a method and apparatus for data synchronization processing, which enables a master node to use a primary virtual machine to process other tasks while performing synchronization processing of the primary and secondary virtual machines, thereby improving performance of the primary node.
  • A data synchronization processing method is provided, which is applied to a simulator of a master node in a computer system. The simulator is used to simulate a hardware device of a first virtual device of the master node, and the computer system further includes a standby node connected to the master node.
  • The method includes: acquiring, by a first thread of the simulator, first to-be-processed information, where the first to-be-processed information is a first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread is a thread that executes non-thread-safe code; writing the first to-be-processed information into a buffer module by the first thread; performing, by a second thread of the simulator, a consistency negotiation process on the first to-be-processed information, where the consistency negotiation process is used to synchronize the order in which the primary node and the standby node process the first data packet; and processing the first data packet by the first thread according to the result of the consistency negotiation process.
  • the master node can mobilize the first thread and the second thread to execute the code to complete the corresponding task.
  • the first thread is a thread that executes the non-thread-safe code. Therefore, the first thread needs to occupy the mutex when performing the operation. For example, the first thread needs to occupy the global mutex before acquiring the first to-be-processed information.
  • the manner in which the first thread acquires the first to-be-processed information is not limited. After the first thread obtains the first to-be-processed information, the first to-be-processed information is written into the buffer module, where the buffer module may be a buffer queue or a heap or stack for buffering the first to-be-processed information.
  • the global mutex can be released, and other threads can occupy the global mutex and schedule the virtual machine to perform other tasks.
  • The second thread reads at least one piece of to-be-processed information in the buffer module and determines, based on the consistency negotiation protocol, a common order in which the primary and standby nodes process the data packets; the first thread then occupies the global mutex and processes the data packets in the order determined by the second thread. Because the consistency negotiation between the active and standby nodes is performed by the second thread, which does not need to occupy the global mutex when working, the master node can use the primary virtual machine to process other tasks while performing synchronization processing of the active and standby virtual machines, which improves the performance of the primary node.
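  • The division of work described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `global_mutex`, `buffer_module`, `agreed_order`, and `negotiate` are hypothetical, and the negotiation with the standby node is replaced by a placeholder that simply accepts the proposed order.

```python
import queue
import threading

global_mutex = threading.Lock()   # stands in for the simulator's global mutex
buffer_module = queue.Queue()     # buffer queue shared by both threads
agreed_order = queue.Queue()      # negotiated order handed back to the first thread
processed = []                    # packets handled by the first thread, in order


def negotiate(packets):
    """Placeholder for the consistency negotiation with the standby node."""
    return list(packets)  # assumption: the standby accepts the proposed order


def first_thread_enqueue(packet):
    # Non-thread-safe work: hold the global mutex only while buffering.
    with global_mutex:
        buffer_module.put(packet)


def second_thread_negotiate():
    # Runs without the global mutex, so other threads may take it meanwhile.
    pending = []
    while not buffer_module.empty():
        pending.append(buffer_module.get())
    for packet in negotiate(pending):
        agreed_order.put(packet)


def first_thread_process():
    # Re-take the global mutex and process packets in the negotiated order.
    with global_mutex:
        while not agreed_order.empty():
            processed.append(agreed_order.get())


for p in ("pkt-1", "pkt-2", "pkt-3"):
    first_thread_enqueue(p)
worker = threading.Thread(target=second_thread_negotiate)
worker.start()
worker.join()
first_thread_process()
```

  • The point of the sketch is that `second_thread_negotiate` never touches `global_mutex`, so the lock is free for other threads during the negotiation.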
  • Performing the consistency negotiation process on the first to-be-processed information by the second thread of the simulator includes: reading, by the second thread, the first to-be-processed information from the buffer module; performing, by the second thread, the consistency negotiation process on the first to-be-processed information to determine the processing order of the first data packet; and writing the first to-be-processed information to a pipeline through the second thread according to the processing order of the first data packet, where the pipeline is used by the first thread to read the first to-be-processed information.
  • the first data packet may be a data packet obtained from the client, or may be a data packet generated by the master node, or may be other data packets.
  • the specific content of the first data packet is not limited in this application.
  • As a worker thread, the second thread cannot directly call the program code of the master node.
  • The consistency negotiation processing scheme provided in this embodiment establishes a pipeline connecting the first thread and the second thread. The second thread writes the result of the consistency negotiation to the pipeline, so that the first thread reads the result of the consistency negotiation through the pipeline, thereby completing the consistency negotiation while avoiding an impact on the security of the master node.
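  • A pipeline between the two threads might look like the following sketch. It assumes an operating-system pipe created with `os.pipe()` and a newline-delimited record format; both are illustrative choices, since the text does not specify how the pipeline is implemented, and the function names are hypothetical.

```python
import os
import threading

read_fd, write_fd = os.pipe()  # pipeline connecting the two threads


def second_thread(order):
    # Write each negotiated item as one newline-terminated record.
    with os.fdopen(write_fd, "w") as pipe_out:
        for item in order:
            pipe_out.write(item + "\n")


def first_thread():
    # Read records until the writer closes its end of the pipe.
    with os.fdopen(read_fd) as pipe_in:
        return [line.rstrip("\n") for line in pipe_in]


writer = threading.Thread(target=second_thread, args=(["pkt-2", "pkt-1"],))
writer.start()
received = first_thread()
writer.join()
```

  • Because the second thread only writes to the pipe and never calls into the first thread's code, the non-thread-safe code is still executed by the first thread alone.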
  • Reading, by the second thread, the first to-be-processed information from the buffer module includes: reading the first to-be-processed information from the buffer module by using the second thread at a preset time.
  • the preset time is, for example, a time corresponding to the timer event
  • the second thread may read the first to-be-processed information from the buffer module based on the trigger of the timer event, and the master node may set different timer events. Therefore, the foregoing embodiment can flexibly trigger the second thread to perform the consistency negotiation process.
  • Before the first to-be-processed information is read from the buffer module by using the second thread, the method further includes: obtaining, by the second thread, the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time. After the consistency negotiation process is performed on the first to-be-processed information by the second thread, the method further includes: when the number of pieces of to-be-processed information in the buffer module is 0, releasing the exclusive permission of the buffer module obtained by the second thread.
  • the exclusive permission may also be called a queue mutual exclusion lock, which is used to prohibit two or more threads from accessing the buffer module at the same time.
  • the second thread releases the queue mutex, and other threads can continue to write new pending information to the buffer module.
  • the foregoing embodiment can prevent the new pending information from being inserted into the to-be-processed information queue that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • Performing the consistency negotiation process on the first to-be-processed information by using the second thread includes: determining, by the second thread, the quantity of to-be-processed information in the buffer module; when the quantity of to-be-processed information in the buffer module is greater than 0, writing, by the second thread, the data packets (including the first data packet) corresponding to the to-be-processed information into a consistency log and deleting the to-be-processed information from the buffer module, where the consistency log is used to cache the data packets and the sequence of the data packets in the consistency log corresponds to the order in which they are to be processed; sending, by the second thread, a consistency negotiation request including the first data packet, where the consistency negotiation request is used to request the standby node to accept the processing order of the first data packet; and receiving, by the second thread, a negotiation completion message, where the negotiation completion message indicates that the processing order of the first data packet has been accepted.
  • the consistency negotiation process is performed, and then the to-be-processed information in the buffer module is deleted, so that the indication information in the buffer module read by the second thread each time is new pending information.
  • This prevents the second thread from reading to-be-processed information that has already been processed, thereby improving the efficiency of the consistency negotiation process.
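  • The drain-and-log step can be illustrated with the sketch below. It assumes the buffer module is a double-ended queue and the consistency log is a list whose index is the processing order; `send_negotiation_request` is a hypothetical stand-in for the exchange with the standby node and always reports acceptance.

```python
import threading
from collections import deque

queue_mutex = threading.Lock()   # exclusive permission of the buffer module
buffer_module = deque(["pkt-a", "pkt-b"])
consistency_log = []             # position in the log = processing order


def send_negotiation_request(log):
    """Hypothetical stand-in: the standby node always accepts the order."""
    return True


def negotiate_once():
    # Hold the queue mutex so no new item lands among those being negotiated.
    with queue_mutex:
        # Move every pending item into the log, deleting it from the buffer.
        while len(buffer_module) > 0:
            consistency_log.append(buffer_module.popleft())
        # The buffer now holds 0 items; negotiate, then release the mutex.
        return send_negotiation_request(consistency_log)


accepted = negotiate_once()
```

  • Deleting each item as it is logged means the next call to `negotiate_once` sees only new to-be-processed information.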
  • The method further includes: obtaining, by the first thread, the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time; after the first thread writes the first to-be-processed information to the buffer module, the method further includes: releasing, by the first thread, the exclusive permission of the buffer module acquired by the first thread.
  • the exclusive permission may also be called a queue mutual exclusion lock, which is used to prohibit two or more threads from accessing the buffer module at the same time.
  • the second thread can occupy the queue mutex lock and read the pending information in the buffer module.
  • the first virtual device runs a primary database
  • the standby node is configured with a second virtual device
  • the second virtual device runs a standby database
  • the first data packet carries a request sent by the client to the primary node for the primary database.
  • Obtaining the first to-be-processed information by using the first thread of the simulator including: acquiring, by the first thread, the first to-be-processed information from the physical network card of the primary node;
  • Processing, by the first thread, the first data packet according to the result of the consistency negotiation process includes: sending, by the first thread, the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node process the first data packet in the same order.
  • the method further includes:
  • The third thread of the simulator obtains the load thresholds of the master node and the proportions of identical dirty pages of the master node and the standby node in $n$ synchronization operations. The load thresholds of the master node in the $n$ synchronization operations are $c_1, \ldots, c_n$, and the proportions of identical dirty pages of the master node and the standby node in the $n$ synchronization operations are $w_1, \ldots, w_n$, where $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and $n$ is a positive integer greater than or equal to 2.
  • $L_m$ is the load value of the primary node at the current time
  • the synchronization request is used to request synchronization of dirty pages of the primary node and the standby node;
  • Performing a consistency negotiation process on the synchronization request by the second thread, and performing a consistency negotiation process on the synchronization request is used to synchronize the order in which the primary node and the standby node process the synchronization request;
  • the synchronization request is processed by the first thread according to the result of performing the consistency negotiation process on the synchronization request.
  • In the prior art, the master node determines whether to start synchronization between the active and standby nodes according to the current load value and a fixed load threshold. If the load value at the current time is greater than the fixed load threshold, the synchronization between the active and standby nodes is not started. When the load value at the current time is less than or equal to the fixed load threshold, the synchronization of the active and standby nodes is started.
  • The above prior art has the disadvantage that it is difficult to determine the optimal time to start synchronizing the active and standby nodes from a fixed load threshold. If the fixed load threshold is set too small, for example to 0, then when the load value of the master node meets the condition, the proportion of identical dirty pages of the active and standby nodes is at its highest (because the virtual machines of the active and standby nodes are no longer working, the dirty pages no longer change), but the virtual machines of the active and standby nodes sit idle between load detection and data synchronization, so virtual machine resources are wasted.
  • If the fixed load threshold is set too large, the virtual machines of the active and standby nodes are still working when the data is synchronized, so the proportion of identical dirty pages of the virtual machines of the active and standby nodes is small.
  • The active and standby nodes then need to transmit more data (that is, the data corresponding to the differing dirty pages), which results in a larger volume of data synchronization between the active and standby nodes.
  • the processor working time of the master node is 10 minutes
  • the virtual machine of the active and standby nodes has the same dirty page ratio of 80%
  • the processor working time of the master node at the second load detection is 20 minutes.
  • the virtual machine of the active and standby nodes has the same dirty page ratio of 85%.
  • the processor working time of the primary node is 20 minutes
  • the virtual machine of the active and standby nodes has the same dirty page ratio of 85%.
  • the above data indicates that the virtual machine of the primary node has stopped working at least during the second load detection.
  • If the virtual machine of the primary node has stopped working before the second load detection, starting data synchronization after the second load detection will inevitably leave the virtual machine of the primary node idle and waste virtual machine resources. Therefore, the preferred data synchronization time is after the first load detection and before the second load detection: when the data synchronization starts, the virtual machine of the primary node has completed most or all of its work, which achieves a better balance between virtual machine resource utilization and the proportion of identical dirty pages.
  • The load threshold is determined according to the load thresholds of the primary node and the proportions of identical dirty pages in at least two synchronization operations. For example, the proportion of identical dirty pages of 80% obtained at the first load detection is used as a weight and multiplied by the load threshold 5 to obtain 4; the proportion of 85% obtained at the second load detection is multiplied by the load threshold 6 to obtain 5.1; and the sum of 4 and 5.1 is divided by the number of load detections, 2, to obtain the weighted average of the load thresholds for the two load detections, 4.55, which is the new load threshold.
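  • The arithmetic of this worked example can be checked with a short sketch. The function name `new_load_threshold` is hypothetical, and the computation follows only the example above (sum of the products $c_i w_i$ divided by the number of detections $n$); the general formula is not reproduced in this excerpt, so treat this as an assumption.

```python
def new_load_threshold(thresholds, same_dirty_ratios):
    """Average of past thresholds c_i, each weighted by its ratio w_i."""
    if len(thresholds) != len(same_dirty_ratios) or len(thresholds) < 2:
        raise ValueError("need n >= 2 matching (c_i, w_i) pairs")
    # Sum of c_i * w_i over the n recorded synchronization operations.
    weighted = sum(c * w for c, w in zip(thresholds, same_dirty_ratios))
    # Divide by the number of load detections, as in the worked example.
    return weighted / len(thresholds)


# Values from the example: thresholds 5 and 6, ratios 80% and 85%.
threshold = new_load_threshold([5, 6], [0.80, 0.85])
```

  • With the example's inputs the result is 4.55 up to floating-point rounding.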
  • If the processor working time obtained by the third load detection is 22 minutes, the load value of the master node at the third load detection is 2 (the working time 22 obtained by the third load detection minus the working time 20 obtained by the second load detection).
  • This load value is less than the new load threshold of 4.55, indicating that the virtual machine of the master node has few remaining tasks and will soon enter the idle state, so the data synchronization operation is started. If the processor working time obtained by the third load detection is 30 minutes, the load value of the master node at the third load detection is 10 (the working time 30 obtained by the third load detection minus the working time 20 obtained by the second load detection); this load value is greater than the new load threshold of 4.55, indicating that the virtual machine of the master node still has many remaining tasks.
  • Whether to perform the data synchronization operation is thus determined by the magnitude relationship between the load value and the new load threshold of 4.55.
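  • The comparison in this example can be sketched as follows. The function names are hypothetical, and the behavior when the load value exactly equals the threshold is an assumption (the example only covers the strictly smaller and strictly larger cases).

```python
def load_value(current_working_time, previous_working_time):
    # Load value = processor working time accumulated since the last detection.
    return current_working_time - previous_working_time


def should_start_sync(load, threshold):
    # Assumption: equality does not trigger synchronization.
    return load < threshold


NEW_THRESHOLD = 4.55  # weighted average from the preceding example

light = load_value(22, 20)   # third detection reports 22 minutes
heavy = load_value(30, 20)   # third detection reports 30 minutes
start_when_light = should_start_sync(light, NEW_THRESHOLD)
start_when_heavy = should_start_sync(heavy, NEW_THRESHOLD)
```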
  • Because the new load threshold in this embodiment is a weighted average determined according to the results of multiple load measurements, the load threshold gradually converges to a more preferable load threshold as the number of load detections increases.
  • The load threshold is therefore a dynamic and more preferable threshold, and the virtual machine resource utilization and the proportion of identical dirty pages of the active and standby nodes can reach a better balance point when data synchronization is performed.
  • the method further includes:
  • $SUM_k$ is the sum of the load values obtained from the first load measurement of the primary node through the $k$-th load measurement, and $k$ is a positive integer;
  • $T_{count}$ is the load measurement threshold
  • $c_0$ is the load threshold of the first synchronization operation of the master node
  • the number of measurements (COUNT) of the load is equal to 0.
  • the first load value $L_1$ is obtained
  • $SUM_1$ is equal to $L_1$
  • the measurement number threshold $T_{count}$ is 2
  • the initial load threshold $c_0$ is equal to $SUM_2$ divided by 2, that is, the initial load threshold is positively correlated with $SUM_2$
  • the initial load threshold is negatively correlated with the number of measurements; if the measurement number threshold $T_{count}$ is 3.
  • the above embodiment can determine an initial load threshold so that the timing at which the primary node synchronizes data for the first time can be determined.
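  • One way to read the initial-threshold procedure is sketched below. It assumes that load values are accumulated until the measurement count reaches $T_{count}$ and that $c_0$ is then the accumulated sum divided by $T_{count}$, which matches the stated case $c_0 = SUM_2 / 2$ for $T_{count} = 2$; the function name and the sample load values are hypothetical.

```python
def initial_load_threshold(load_values, t_count):
    """Return c_0 once t_count measurements are available, else None."""
    count = 0      # COUNT starts at 0
    total = 0.0    # running SUM_k
    for load in load_values:
        total += load
        count += 1
        if count == t_count:
            # c_0 = SUM_{T_count} / T_count
            return total / t_count
    return None  # not enough measurements yet


c0 = initial_load_threshold([4, 6], t_count=2)
not_ready = initial_load_threshold([4], t_count=2)
```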
  • the load value of the primary node includes a processor load value and a memory load value
  • the load threshold of the primary node includes a processor load threshold and a memory load threshold
  • the relationship between the processor load value and the processor load threshold may be compared first, and then the relationship between the memory load value and the memory load threshold may be compared, or the relationship between the memory load value and the memory load threshold may be compared first. Then compare the relationship between the processor load value and the processor load threshold, so that the timing of data synchronization between the active and standby nodes can be flexibly determined.
  • A data synchronization processing apparatus is provided, which is applied to a simulator of a master node in a computer system. The simulator is used to simulate a hardware device of a first virtual device of the master node, and the computer system further includes a standby node connected to the master node.
  • The device includes:
  • a first thread control unit, configured to acquire first to-be-processed information, where the first to-be-processed information is a first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread control unit is configured to execute non-thread-safe code; and to write the first to-be-processed information to the buffer module;
  • a second thread control unit configured to perform a consistency negotiation process on the first to-be-processed information, where the consistency negotiation process is used to synchronize the order in which the primary node and the standby node process the first data packet;
  • the first thread control unit is further configured to process the first data packet according to a result of the second thread control unit performing the consistency negotiation process.
  • the data synchronization processing device can execute the code by the first thread control unit and the second thread control unit to complete the corresponding task.
  • The first thread control unit is configured to execute the non-thread-safe code. Therefore, the first thread control unit needs to occupy the mutex when performing the operation. For example, the first thread control unit needs to occupy the global mutex before acquiring the first to-be-processed information.
  • the method for obtaining the first to-be-processed information by the first thread control unit is not limited in this application.
  • After acquiring the first to-be-processed information, the first thread control unit writes the first to-be-processed information to the buffer module, where the buffer module may be a buffer queue, a heap, or a stack for buffering the first to-be-processed information, or may be another data structure for buffering the first to-be-processed information, which is not limited in this application.
  • the global mutex can be released, and other threads can occupy the global mutex and schedule the virtual machine to perform other tasks.
  • The second thread control unit reads at least one piece of to-be-processed information in the buffer module and determines, based on the consistency negotiation protocol, a common order in which the active and standby nodes process the data packets; the first thread control unit then occupies the global mutex and processes the data packets in the processing order determined by the second thread control unit. Because the consistency negotiation between the active and standby nodes is performed by the second thread control unit, which does not need to occupy the global mutex when working, the data synchronization processing device allows the master node to use the primary virtual machine to process other tasks while performing synchronization processing of the active and standby virtual machines, which improves the performance of the master node.
  • the second thread control unit is specifically configured to:
  • the first to-be-processed information is written to the pipeline according to the processed order of the first data packet, and the pipeline is used by the first thread control unit to read the first to-be-processed information.
  • the first data packet may be a data packet obtained from the client, or may be a data packet generated by the master node, or may be other data packets.
  • The specific content of the first data packet is not limited in this application. Since some program code of the master node is non-thread-safe, the second thread control unit, as a worker thread, cannot directly call the program code of the master node.
  • The consistency negotiation processing scheme provided in this embodiment establishes a pipeline connecting the first thread control unit and the second thread control unit. The second thread control unit writes the result of the consistency negotiation to the pipeline, so that the first thread control unit reads the result of the consistency negotiation through the pipeline, thereby completing the consistency negotiation while avoiding an impact on the security of the master node.
  • the second thread control unit is further configured to: read the first to-be-processed information from the buffer module at a preset time.
  • the preset time is, for example, a time corresponding to the timer event
  • the second thread control unit may read the first to-be-processed information from the buffer module based on the trigger of the timer event, and the master node may set different timer events. Therefore, the above embodiment can flexibly trigger the second thread control unit to perform the consistency negotiation process.
  • the second thread control unit is further configured to: obtain the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time;
  • the second thread control unit is further configured to: when the number of pieces of information to be processed in the buffer module is 0, release the exclusive permission of the buffer module acquired by the second thread.
  • the second thread control unit When the second thread control unit starts to work, it first occupies the exclusive right of the buffer module, which may also be called a queue mutex lock, for prohibiting two or more thread control units from accessing the buffer module at the same time. When the number of pieces of information to be processed in the buffer module is 0, the second thread control unit releases the queue mutex, and other threads may continue to write new pending information to the buffer module.
  • the foregoing embodiment can prevent the new pending information from being inserted into the to-be-processed information queue that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • the second thread control unit is further configured to:
  • the data packet corresponding to the to-be-processed information is written into the consistency log, and the to-be-processed information is deleted.
  • the consistency log is used to cache the data packets corresponding to the to-be-processed information, and the sequence of the data packets in the consistency log corresponds to the order in which they are to be processed; the to-be-processed information includes the first to-be-processed information, and the data packets corresponding to the to-be-processed information include the first data packet.
  • a negotiation completion message is received, the negotiation completion message is used to indicate that the processed sequence of the first data packet has been accepted.
  • The consistency negotiation process is performed, and then the to-be-processed information in the buffer module is deleted, so that the indication information in the buffer module read by the second thread control unit is always new to-be-processed information. This prevents the second thread control unit from reading to-be-processed information that has already been processed, thereby improving the efficiency of the consistency negotiation process.
  • the first thread control unit is further configured to: obtain the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time;
  • the first thread control unit is further configured to: release the exclusive permission of the buffer module acquired by the first thread control unit.
  • The first thread control unit first occupies the exclusive permission of the buffer module before writing to the buffer module. The exclusive permission may also be referred to as a queue mutex, and it prohibits two or more thread control units from accessing the buffer module at the same time.
  • the second thread control unit can occupy the queue mutex lock and read the pending information in the buffer module.
  • the foregoing embodiment can prevent the new pending information from being inserted into the queue of the information to be processed that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • the first virtual device runs a primary database
  • the standby node is configured with a second virtual device
  • the second virtual device runs a standby database
  • the first data packet carries a request sent by the client to the primary node for the primary database.
  • the first thread control unit is further configured to: obtain the first to-be-processed information from the physical network card of the primary node; and send the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node process the first data packet in the same order.
  • the device further includes a third thread control unit, and the third thread control unit is configured to:
  • the proportions of identical dirty pages of the master node and the standby node in the $n$ synchronization operations are $w_1, \ldots, w_n$, where $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and $n$ is a positive integer greater than or equal to 2;
  • $L_m$ is the load value of the primary node at the current time
  • a synchronization request is generated, the synchronization request is used to request synchronization of dirty pages of the primary node and the standby node;
  • the second thread control unit is also specifically used to:
  • the first thread control unit is also specifically used to:
  • the synchronization request is processed according to the result of performing the consistency negotiation process on the synchronization request.
  • the load threshold used by the device for data synchronization is a dynamic and more suitable threshold, so that the virtual machine resource utilization and the same dirty page ratio of the active and standby nodes can reach a better balance point when data synchronization is performed.
  • the third thread control unit is further configured to:
  • SUM_k is the sum of the load values obtained from the first load measurement through the kth load measurement of the master node, and k is a positive integer;
  • the above embodiment can determine an initial load threshold so that the timing at which the primary node synchronizes data for the first time can be determined.
  • the load value of the primary node includes a processor load value and a memory load value
  • the load threshold of the primary node includes a processor load threshold and a memory load threshold
  • the relationship between the processor load value and the processor load threshold may be compared first, and then the relationship between the memory load value and the memory load threshold may be compared, or the relationship between the memory load value and the memory load threshold may be compared first. Then compare the relationship between the processor load value and the processor load threshold, so that the timing of data synchronization between the active and standby nodes can be flexibly determined.
  • a data synchronization processing apparatus having the functionality of an execution device implementing the method of the first aspect, comprising corresponding means for performing the steps or functions described in the above method aspects.
  • the steps or functions may be implemented by software, or by hardware (such as a circuit), or by a combination of hardware and software.
  • the above apparatus includes one or more processing units and one or more communication units.
  • the one or more processing units are configured to support the apparatus to implement a corresponding function of the execution device of the above method, for example, acquiring the first pending information by the first thread.
  • the one or more communication units are configured to support the device to communicate with other devices to implement receiving and/or transmitting functions. For example, the first packet is obtained from the client.
  • the above apparatus may further comprise one or more memories for coupling with the processor, which store program instructions and/or data necessary for the device.
  • the one or more memories may be integrated with the processor or may be separate from the processor. This application is not limited.
  • the device can be a chip.
  • the communication unit may be an input/output circuit or an interface of the chip.
  • the above apparatus includes a transceiver, a processor, and a memory.
  • the processor is configured to control the transceiver or an input/output circuit to send and receive signals, the memory is configured to store a computer program, and the processor is configured to execute the computer program in the memory, so that the device performs the method in the first aspect or any possible implementation of the first aspect.
  • a computer system comprising the primary node and the standby node according to the first aspect, wherein the primary node is configured to perform the method in the first aspect or any possible implementation of the first aspect.
  • in a fifth aspect, a computer-readable storage medium for storing a computer program, the computer program comprising instructions for performing the method in the first aspect or any possible implementation of the first aspect.
  • a computer program product comprising computer program code that, when run on a computer, causes the computer to perform the method in the first aspect or any possible implementation of the first aspect.
  • Figure 1 is a schematic illustration of a computer system suitable for use with the present application
  • FIG. 2 is a schematic diagram of virtual machine state replication suitable for use in the present application
  • FIG. 3 is a schematic diagram of a data synchronization processing method provided by the present application.
  • FIG. 4 is a schematic diagram of data synchronization between a primary and secondary virtual machine provided by the present application.
  • FIG. 5 is a schematic diagram of a method for determining a timing of data synchronization between a primary and a secondary virtual machine according to the present application
  • FIG. 6 is a schematic diagram of a method for determining an initial load threshold provided by the present application.
  • FIG. 7 is a schematic diagram of another data synchronization processing method provided by the present application.
  • FIG. 8 is a schematic diagram of a consistency negotiation method provided by the present application.
  • FIG. 9 is a schematic diagram of still another data synchronization processing method provided by the present application.
  • FIG. 10 is a schematic diagram of still another data synchronization processing method provided by the present application.
  • FIG. 11 is a schematic structural diagram of a data synchronization processing apparatus provided by the present application.
  • FIG. 12 is another schematic structural diagram of a master node provided by the present application.
  • Figure 1 shows a schematic diagram of a computer system suitable for use in the present application.
  • the computer system 100 includes a host 1 and a host 2.
  • the host 1 includes a hardware platform and a host operating system installed on the hardware platform.
  • the host 1 further includes a virtual machine 1 and a quick emulator (Qemu) 1 running on the host operating system, wherein a database 1 runs on the virtual machine 1.
  • Qemu emulates hardware devices for use by virtual machines.
  • Qemu can monitor the workload of virtual machines running on Qemu.
  • the workload of a virtual machine includes the virtual machine's occupancy of the central processing unit (CPU) and the virtual machine's disk usage, where the CPU and the disk are located in the hardware platform.
  • the host 2 includes a hardware platform and a host operating system installed on the hardware platform.
  • the host 2 further includes a virtual machine 2 and a Qemu 2 running on the host operating system, wherein the virtual machine 2 runs a database 2.
  • the database 1 is the primary database
  • the database 2 is the standby database.
  • the database 2 can replace the database 1 as the primary database for the client to access.
  • the virtual machine 1 and the virtual machine 3 can serve as active and standby virtual machines for each other.
  • the host 1 and the host 2 are mutually active and standby nodes.
  • Host 1 and host 2 can communicate with each other through a network interface card (NIC) and can communicate with the client separately.
  • host 1 is the master node and host 2 is the standby node
  • virtual machine 1 can process the four data packets in the order of 1234
  • the consistency negotiation module of Qemu1 in the master node and the consistency negotiation module of Qemu3 in the host 2 determine through negotiation that the virtual machine 3 also processes the four data packets in the order 1234, so that the virtual machine 1 and the virtual machine 3 process the four data packets in the same order.
  • as a result, the primary node and the standby node differ in only a small number of memory dirty pages, and only a small amount of data needs to be transferred during synchronization.
  • the consistency negotiation module can implement the consistency negotiation of the data packet processing order by using the paxos algorithm.
  • an observer node is further introduced in FIG. 1, wherein the observer node may include a Qemu that has a consistency negotiation module; this is described in more detail below.
  • the above computer system 100 is merely an example, and the computer system applicable to the present application is not limited thereto.
  • the computer system 100 may further include other hosts.
  • different hosts can communicate via radio waves or communicate over Ethernet.
  • FIG. 2 is a schematic diagram of a virtual machine state replication provided by the present application.
  • the Paxos negotiation module (that is, the consistency negotiation module) is deployed in the Qemu of the active and standby nodes, and all virtual machines run the same database program in parallel.
  • after the data packets from the client reach the primary node, the Paxos module of the primary node negotiates the processing order of each received data packet with the Paxos modules of the standby nodes, so that all virtual machines process the same data packets in the same order. As a result, the standby node and the primary node differ in only a small number of memory dirty pages, and only a small amount of data needs to be transferred to complete synchronization, which improves the efficiency of synchronization.
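  • The effect of an agreed processing order can be illustrated with a minimal Python sketch. The key/value "database", the packet contents, and the function name `apply_in_order` are assumptions made for illustration only; they do not come from the application. Two replicas that apply the same writes in the agreed order end in the same state, while a replica that applies them in a different order may diverge.

```python
def apply_in_order(packets, order):
    """Apply key/value writes in the given order and return the final state."""
    state = {}
    for packet_id in order:
        key, value = packets[packet_id]
        state[key] = value
    return state


# Four hypothetical client packets, each a write of (key, value).
packets = {1: ("x", "a"), 2: ("x", "b"), 3: ("y", "c"), 4: ("x", "d")}

agreed_order = [1, 2, 3, 4]  # order fixed by the consistency negotiation
primary_state = apply_in_order(packets, agreed_order)
standby_state = apply_in_order(packets, agreed_order)

# A replica that processed the same packets in another order.
diverged_state = apply_in_order(packets, [4, 3, 2, 1])
```

  • Here `primary_state` and `standby_state` are identical, whereas `diverged_state` holds a different value for `x`; in the virtual machine setting such a difference would show up as differing dirty pages.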
  • the primary node and the standby node run the same database, and the shaded memory dirty pages (also referred to as "dirty pages”) represent dirty pages of virtual machine 2 that differ from virtual machine 1.
  • Paxos negotiation module shown in FIG. 2 is merely an example, and other consistency algorithms are also applicable to the present application.
  • FIG. 3 shows a flow chart of a data synchronization processing method 300 provided by the present application.
  • the method 300 is applied to a master node in a computer system. Specifically, the method 300 is applied to a Qemu1 of a master node in a computer system.
  • the computer system further includes a standby node connected to the master node, and the method 300 includes:
  • the first to-be-processed information is obtained by the first thread, where the first to-be-processed information is the first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread is a thread that executes non-thread-safe code.
  • the first indication information may be, for example, indication information including a pointer and a data size
  • the pointer is used to indicate a storage address of the first data packet
  • the first thread may read, at the storage address indicated by the pointer, data of the size indicated by the data size information, so as to obtain the first data packet.
  • the buffer module is a buffer queue, and may also be a heap or a stack for buffering the first to-be-processed information, and may also be another data structure for buffering the first to-be-processed information. This application does not limit this.
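  • A minimal Python sketch of indication information that carries a pointer and a data size is shown below. Python has no raw pointers, so an offset into a `bytearray` stands in for the storage address; the names `PacketDescriptor` and `read_packet` and the sample memory contents are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketDescriptor:
    offset: int  # stands in for the pointer (storage address of the packet)
    size: int    # data size of the packet in bytes


def read_packet(memory: bytearray, desc: PacketDescriptor) -> bytes:
    """Read `size` bytes starting at the address the descriptor points to."""
    return bytes(memory[desc.offset:desc.offset + desc.size])


# Hypothetical memory region that holds one packet at offset 4.
memory = bytearray(b"....GET key1....")

# The buffer module is modelled here as a FIFO queue of descriptors.
buffer_queue = deque()
buffer_queue.append(PacketDescriptor(offset=4, size=8))

desc = buffer_queue.popleft()
packet = read_packet(memory, desc)
```

  • Queuing the descriptor rather than the packet keeps the buffer entries small; the packet bytes are copied only when a thread actually needs them.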
  • the first data packet is processed by the first thread according to the result of the consistency negotiation process.
  • the master node may schedule the first thread and the second thread to execute code to complete the corresponding tasks.
  • for brevity, this behavior is sometimes described as “completing the task by the first thread” or “the first thread completes the task”.
  • for example, “obtaining the first to-be-processed information by the first thread” and “the first thread acquiring the first to-be-processed information” can both be understood as the master node scheduling the first thread to execute code to obtain the first to-be-processed information.
  • the first thread is a thread that executes non-thread-safe code.
  • the first thread is Qemu's main loop thread (Qemu main loop), which executes Qemu's core code and is a dedicated event processing loop thread; the main loop thread calls the corresponding handler to handle an event based on the state change of a file descriptor. The second thread is a Qemu worker thread.
  • Qemu's core code is non-thread-safe, that is, Qemu does not provide data access protection, so multiple Qemu threads may change the same data and cause inconsistency. Therefore, the first thread needs to hold a mutex when performing operations. For example, the main loop thread needs to occupy a global mutex before acquiring the first to-be-processed information, and releases the global mutex after writing the first to-be-processed information to the buffer module, which ensures that at any given time only the main loop thread occupying the global mutex can acquire the first to-be-processed information and write it to the buffer module.
  • the first to-be-processed information is any to-be-processed information obtained by the master node, and may be a data packet or a descriptor indicating the data packet (that is, indication information).
  • the master node may directly write the data packet to the buffer module, or may generate a descriptor indicating the data packet and write the descriptor to the buffer module, where the descriptor may include a pointer to the data packet, along with information indicating the length and type of the data packet.
  • the first data packet may also be a data packet generated locally by the primary node.
  • the application does not limit the specific content of the first data packet and the method for the primary node to acquire the first data packet.
  • the first thread acquires the first to-be-processed information
  • the first to-be-processed information is written to the buffer module.
  • the second thread reads at least one piece of to-be-processed information from the buffer module and, based on a consistency negotiation protocol (e.g., Paxos), determines a common order in which the active and standby nodes process the data packets; the first thread then occupies the global mutex and processes the data packets in the order determined by the second thread.
  • the second thread is, for example, a consensus negotiation thread.
  • the first thread may process the first data packet according to the type of the first data packet. For example, when the first data packet is a data packet sent by the client, the first thread may send the first data packet to the virtual machine of the primary node for processing.
  • the primary node may perform the synchronous operation of the active and standby nodes according to the request data packet.
  • the master node can use the virtual machine of the primary node to process other tasks while the synchronization operation of the active and standby virtual machines is performed, which improves the performance of the primary node.
  • the database 1 and the database 2 can be guaranteed to process access requests in the same order, thereby minimizing the difference between the dirty pages of the active and standby nodes and reducing the number of dirty pages that need to be transferred during active/standby synchronization.
  • S330 includes:
  • the first pending information is read from the buffer module by using the second thread.
  • S333 Write, by the second thread, the first to-be-processed information to the pipe according to the processing order of the first data packet, where the pipe is used by the first thread to read the first to-be-processed information.
  • the consistency negotiation processing scheme provided in this embodiment builds a pipe that connects the first thread and the second thread, and adds the pipe to the event loop list of the Qemu main loop thread.
  • the second thread performs a write operation on the file descriptor of the pipe, so that the file descriptor becomes readable at the Qemu main loop thread's end. After the Qemu main loop thread reads the file descriptor, the corresponding handler can be called to perform subsequent processing.
  • the virtual network card processing code (RTL8139_do_receiver) needs to be executed to perform the logical operation of the virtual network card on the first data packet; however, the processing code of the RTL8139 virtual network card is non-thread-safe code.
  • the second thread can write the descriptor of the first data packet into the pipe, performing a write operation on the file descriptor so that the file descriptor becomes readable at the Qemu main loop thread's end.
  • after the Qemu main loop thread reads the file descriptor, the virtual network card processing code is called to perform subsequent processing on the first data packet. Therefore, the foregoing embodiment can complete the processing task after the consistency negotiation while ensuring the thread safety of the master node.
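  • The pipe-based hand-off can be sketched in Python with `os.pipe` and the `selectors` module. This is an illustrative analogue under stated assumptions, not Qemu code: `nic_handler` stands in for the non-thread-safe virtual network card processing code, `worker` stands in for the second thread, and the one-byte token format is invented for the example.

```python
import os
import selectors
import threading

read_fd, write_fd = os.pipe()
ordered = []  # descriptors in negotiated order, filled by the worker thread
handled = []  # packets processed on the main loop thread


def nic_handler(fd):
    # Runs only on the main loop thread, like non-thread-safe device code.
    token = os.read(fd, 1)
    handled.append(ordered[token[0]])


def worker():
    # Stand-in for the second thread: publish the packet, then notify.
    ordered.append("packet-1")
    os.write(write_fd, bytes([0]))  # makes read_fd readable in the main loop


sel = selectors.DefaultSelector()
sel.register(read_fd, selectors.EVENT_READ, nic_handler)  # add to event loop

t = threading.Thread(target=worker)
t.start()

# One iteration of the main loop: wait for readable descriptors, run handlers.
for key, _events in sel.select(timeout=5):
    key.data(key.fileobj)

t.join()
sel.close()
os.close(read_fd)
os.close(write_fd)
```

  • The worker never calls the handler itself; it only writes to the pipe, so the non-thread-safe handler still runs on a single thread.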
  • S331 includes:
  • S3311 The first to-be-processed information is read from the buffer module by using the second thread at a preset time.
  • the preset time is, for example, a time corresponding to the timer event
  • the second thread may read the first to-be-processed information from the buffer module based on the trigger of the timer event, and the master node may set different timer events. Therefore, the foregoing embodiment can flexibly trigger the second thread to perform the consistency negotiation process.
  • the method 300 further includes:
  • S3301 Obtain exclusive permission of the buffer module by using the second thread, and the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time.
  • method 300 further includes:
  • when the second thread starts working, it first obtains the exclusive permission of the buffer queue, which may also be called a queue mutex and is used to prohibit two or more threads from accessing (writing and/or reading) the buffer queue at the same time.
  • the second thread releases the queue mutex, and other threads can continue to write new pending information to the buffer queue.
  • the foregoing embodiment can prevent the new pending information from being inserted into the to-be-processed information queue that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • S332 includes:
  • the data packet corresponding to the to-be-processed information (including the first data packet) is written into the consistency log by the second thread, and the to-be-processed information in the buffer module is deleted.
  • the consistency log is used to cache the data packets, and the order of the data packets in the consistency log corresponds to the order in which those data packets are processed.
  • S3323 Send, by using the second thread, a consistency negotiation request that includes the first data packet, where the consistency negotiation request is used to request the standby node to accept the processed sequence of the first data packet.
  • the negotiation completion message is received by the second thread, where the negotiation completion message is used to indicate that the processed sequence of the first data packet has been accepted.
  • when a timer event or an I/O event is triggered, the second thread first occupies the queue mutex and then checks whether the buffer queue is empty. If the buffer queue is empty, the queue mutex is released; if the buffer queue is not empty, the second thread sequentially reads the members of the queue (data packets or packet descriptors), inserts the packet corresponding to each member into the consistency log of the Paxos protocol, and then removes the member from the queue and releases the memory occupied by the original packet. The second thread reads until the queue is empty, and then releases the queue mutex.
  • after the queue mutex is released, the second thread sends the data packets in the consistency log to the standby node in sequence, requesting the standby node to process the data packets in the consistency log in that sequence. When the second thread receives the negotiation completion message from the standby node, it determines that the processing order of the data packets in the consistency log has been accepted by the standby node.
  • the second thread performs the consistency negotiation process after reading the to-be-processed information and deletes that information from the buffer module, which ensures that the information the second thread reads from the buffer module each time has not yet been processed. This prevents the second thread from reading to-be-processed information that has already been processed, thereby improving the efficiency of the consistency negotiation process.
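  • The second thread's steps can be sketched as follows. This is a simplified Python illustration, not an implementation of Paxos: the standby node is a stub that always accepts, and the names `drain_to_log`, `negotiate`, and `standby_accept` are hypothetical.

```python
import threading
from collections import deque

queue_mutex = threading.Lock()  # exclusive permission of the buffer queue
buffer_queue = deque()          # filled by the first thread
consistency_log = []            # position in the list = processing order


def drain_to_log():
    """Move all pending packets into the log and return the new entries."""
    start = len(consistency_log)
    with queue_mutex:
        while buffer_queue:
            # popleft() removes the member, so it is never read twice.
            consistency_log.append(buffer_queue.popleft())
    return consistency_log[start:]


standby_log = []


def standby_accept(proposal):
    """Stub standby node: accepts the proposed order unconditionally."""
    for _index, packet in proposal:
        standby_log.append(packet)
    return True  # plays the role of the negotiation completion message


def negotiate(new_entries, first_index, send):
    """Send (log index, packet) pairs in order; return the standby's answer."""
    proposal = [(first_index + i, p) for i, p in enumerate(new_entries)]
    return send(proposal)


buffer_queue.extend([b"p1", b"p2", b"p3"])
entries = drain_to_log()
first_index = len(consistency_log) - len(entries)
accepted = negotiate(entries, first_index, standby_accept)
```

  • The mutex is held only while the queue is drained; the negotiation itself runs after the lock is released, so the first thread can keep writing new packets in the meantime.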
  • before S320, the method 300 further includes:
  • S319 Acquire exclusive permission of the buffer module by using the first thread, and the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time.
  • method 300 further includes:
  • S321 Release the exclusive permission of the buffer module acquired by the first thread.
  • the exclusive permission may also be called a queue mutual exclusion lock, which is used to prohibit two or more threads from accessing the buffer module at the same time.
  • the second thread can occupy the queue mutex lock and read the pending information in the buffer module.
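  • A minimal Python sketch of the acquire/write/release sequence on the first thread's side, run concurrently with a reader, is given below. The function names and the packet values are assumptions for illustration; `threading.Lock` plays the role of the queue mutex lock.

```python
import threading
from collections import deque

queue_mutex = threading.Lock()
buffer_queue = deque()
drained = []


def first_thread_write(packet):
    queue_mutex.acquire()            # S319: obtain exclusive permission
    try:
        buffer_queue.append(packet)  # write to the buffer module
    finally:
        queue_mutex.release()        # S321: release exclusive permission


def second_thread_read_all():
    # The reader takes the same lock, so it never interleaves with a write.
    with queue_mutex:
        while buffer_queue:
            drained.append(buffer_queue.popleft())


def producer():
    for i in range(1000):
        first_thread_write(i)


p = threading.Thread(target=producer)
p.start()
while p.is_alive():
    second_thread_read_all()
p.join()
second_thread_read_all()  # pick up anything written after the last pass
```

  • Every packet is read exactly once and in the order it was written, regardless of how the two threads are scheduled.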
  • the first virtual device runs a primary database
  • the standby node is configured with a second virtual device
  • the second virtual device runs a standby database, where the first data packet carries an access request sent by the client to the primary node for the primary database
  • S310 includes: acquiring, by the first thread, the first to-be-processed information from the physical network card of the primary node.
  • S340 includes: transmitting, by the first thread, the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node process the first data packet in the same order.
  • the first data packet sent by the client reaches the Qemu of the master node through the physical network card of the master node. After the consistency negotiation process of the master node, the first data packet is processed by the master node and the standby node in the same processing order, thereby improving The same dirty page ratio of the active and standby nodes.
  • the method 300 further includes:
  • L_m is the load value of the master node at the current time.
  • the synchronization request is processed by the first thread according to a result of performing a consistency negotiation process on the synchronization request.
  • in the prior art, the master node determines whether to start synchronization between the active and standby nodes according to the load value at the current time and a fixed load threshold. If the load value at the current time is greater than the fixed load threshold, synchronization between the active and standby nodes is not started; when the load value at the current time is less than or equal to the fixed load threshold, synchronization of the active and standby nodes is started.
  • the above prior art has the disadvantage that it is difficult to determine the optimal timing for starting synchronization of the active and standby nodes with a fixed load threshold. If the fixed load threshold is set too small, for example to 0, then when the load value of the master node meets the condition the same dirty page ratio of the active and standby nodes is at its highest (because the virtual machines of the active and standby nodes are no longer working, the dirty pages no longer change), but the virtual machines of the active and standby nodes sit idle between load detection and data synchronization, so virtual machine resources of the active and standby nodes are wasted.
  • if the fixed load threshold is set too large, the virtual machines of the active and standby nodes are still working when data synchronization starts.
  • in that case the same dirty page ratio of the virtual machines of the active and standby nodes is small.
  • the active and standby nodes then need to transmit more data (that is, data corresponding to the differing dirty pages), so data synchronization between the active and standby nodes consumes more network resources.
  • for example, the processor working time of the primary virtual machine 1 from startup to the first load detection is 10 minutes
  • and the same dirty page ratio of the virtual machines of the active and standby nodes is 80%
  • the processor working time of the primary virtual machine 1 from startup to the second load detection is 20 minutes
  • and the same dirty page ratio of the virtual machines of the active and standby nodes is 85%
  • the processor working time of the primary virtual machine 1 from startup to the third load detection is 20 minutes
  • and the same dirty page ratio of the virtual machines of the active and standby nodes is 85%.
  • the above data indicates that the virtual machine 1 of the primary node is already idle by the second load detection at the latest.
  • the virtual machine 1 of the primary node may already have been idle before the second load detection; if data synchronization is started only after the second load detection, the virtual machine 1 of the primary node has been idle for a period of time and virtual machine resources are wasted. Therefore, the preferred synchronization time of the active and standby nodes lies after the first load detection and before the second load detection. If active/standby synchronization starts within this period, when the virtual machine 1 of the master node has completed most or all of its work, a better balance between virtual machine resource utilization and the same dirty page ratio can be obtained.
  • the load threshold is determined according to the load thresholds of the primary node and the same dirty page ratios at at least two synchronization operations. For example, the same dirty page ratio of 80% obtained at the first load detection is used as a weight and multiplied by the load threshold 5 to obtain 4, and the same dirty page ratio of 85% obtained at the second load detection is multiplied by the load threshold 6 to obtain 5.1. The sum of 4 and 5.1 is divided by the number of load detections, 2, to obtain the weighted average of the load thresholds of the two load detections, 4.55, which is the new load threshold.
  • if the processor working time obtained by the third load detection is 22, the load value of the master node at the third load detection is 2 (the working time 22 obtained by the third load detection minus the working time 20 obtained by the second load detection). This load value is less than the new load threshold of 4.55, indicating that few tasks remain for the virtual machine of the primary node and that the virtual machine will soon be idle, so the data synchronization operation is started.
  • if the processor working time obtained by the third load detection is 30, the load value of the primary node at the third load detection is 10 (the working time 30 obtained by the third load detection minus the working time 20 obtained by the second load detection). This load value is greater than the new load threshold of 4.55, indicating that many tasks still remain for the virtual machine of the master node; if data synchronization were started now, the same dirty page ratio of the active and standby nodes would be small, so the data synchronization operation is not started.
  • because the new load threshold in this embodiment is a weighted average determined from the results of multiple load measurements, it gradually converges to a more suitable load threshold as the number of load detections increases.
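  • The arithmetic of the example above can be reproduced with a short Python sketch. The function name `next_load_threshold` is hypothetical; the computation follows the description: multiply each earlier load threshold by its same dirty page ratio, sum the products, and divide by the number of synchronization operations.

```python
def next_load_threshold(thresholds, same_page_ratios):
    """Weighted average of past thresholds c_i with weights w_i, divided by n."""
    n = len(thresholds)
    if n != len(same_page_ratios) or n < 2:
        raise ValueError("need matching lists covering at least two operations")
    weighted_sum = sum(c * w for c, w in zip(thresholds, same_page_ratios))
    return weighted_sum / n


# Values from the example: thresholds 5 and 6, ratios 80% and 85%.
new_threshold = next_load_threshold([5, 6], [0.80, 0.85])  # about 4.55

# Load values from the two third-detection cases in the example.
start_sync_when_load_is_2 = 2 < new_threshold
start_sync_when_load_is_10 = 10 < new_threshold
```

  • With these inputs the load value 2 falls below the new threshold and triggers synchronization, while the load value 10 does not.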
  • the third thread is, for example, a worker thread of the active/standby synchronization module in the master node, that is, a thread responsible for synchronization of the active and standby virtual machines.
  • the load threshold is a dynamic and more suitable threshold, so that the virtual machine resource utilization and the same dirty page ratio of the active and standby nodes reach a better balance point when data synchronization is performed.
  • the method 300 further includes:
  • S3001 Acquire, by the third thread, SUM_k, where SUM_k is the sum of the load values obtained from the first load measurement through the kth load measurement of the primary node, and k is a positive integer.
  • T_count is the load measurement number threshold
  • c_0 is the load threshold of the first synchronization operation of the primary node
  • c_0 = SUM_k / k.
  • initially, the number of load measurements (COUNT) is equal to 0.
  • the first load value L_1 is obtained
  • SUM_1 is equal to L_1
  • if the measurement number threshold T_count is 2
  • the initial load threshold c_0 is equal to SUM_2 divided by 2, that is, the initial load threshold is positively correlated with SUM_2
  • and the initial load threshold is negatively correlated with the number of measurements; if the measurement number threshold T_count is 3, the initial load threshold c_0 is equal to SUM_3 divided by 3.
  • SUM_1 is equal to the load value obtained by the first load measurement.
  • the above embodiment can determine an initial load threshold so that the timing at which the primary node synchronizes data for the first time can be determined.
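  • A minimal Python sketch of the initial threshold computation c_0 = SUM_k / k follows. It assumes that measurements are taken until the count reaches T_count and that k equals T_count at that point; the function name `initial_load_threshold` and the sample load values are hypothetical.

```python
def initial_load_threshold(measure_load, t_count):
    """Accumulate t_count load measurements and return their average."""
    if t_count < 1:
        raise ValueError("t_count must be a positive integer")
    count = 0      # COUNT starts at 0
    total = 0.0    # SUM_0 = 0
    while count < t_count:
        total += measure_load()  # SUM_k = SUM_(k-1) + L_k
        count += 1
    return total / count         # c_0 = SUM_k / k


# Two hypothetical load measurements, L_1 = 4 and L_2 = 6.
samples = iter([4.0, 6.0])
c0 = initial_load_threshold(lambda: next(samples), t_count=2)
```

  • With T_count equal to 2 and these samples, SUM_2 is 10 and c_0 is 5.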
  • the load value of the primary node includes a processor load value and a memory load value
  • the load threshold of the primary node includes a processor load threshold and a memory load threshold
  • the relationship between the processor load value and the processor load threshold may be compared first, and then the relationship between the memory load value and the memory load threshold may be compared, or the relationship between the memory load value and the memory load threshold may be compared first. Then compare the relationship between the processor load value and the processor load threshold, so that the timing of data synchronization between the active and standby nodes can be flexibly determined.
  • the primary virtual machine is a virtual machine running on the primary node
  • the standby virtual machine is a virtual machine running on the standby node.
  • synchronizing the primary and standby virtual machines means synchronizing the data of the active and standby nodes. The synchronization proceeds in the following phases:
  • T0-T1: The primary virtual machine and the standby virtual machine run, and a list of dirty pages is recorded.
  • T1-T2: The primary virtual machine and the standby virtual machine stop running, and each computes hash values of its dirty pages.
  • T2-T3: The primary virtual machine compares the hash values of its dirty pages with those of the standby virtual machine.
  • T3-T4: The primary virtual machine transfers the differing dirty pages to the standby virtual machine.
  • the primary virtual machine releases the buffered network output (different dirty page data) and resumes operation, and the standby virtual machine resumes operation.
  • T1 is a time for performing synchronization between the active and standby virtual machines.
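  • The hash-and-compare phases can be sketched in Python. This is a simplified model under stated assumptions: memory is a dictionary from page number to page bytes, both virtual machines share the same dirty page list, and SHA-256 is chosen arbitrarily because the application does not name a hash function. The names `page_hashes` and `pages_to_transfer` are hypothetical.

```python
import hashlib


def page_hashes(memory, dirty_pages):
    """T1-T2: hash every dirty page of one virtual machine."""
    return {n: hashlib.sha256(memory[n]).hexdigest() for n in dirty_pages}


def pages_to_transfer(primary_memory, dirty_pages, standby_hashes):
    """T2-T3: return the dirty pages whose hashes differ on the standby."""
    primary_hashes = page_hashes(primary_memory, dirty_pages)
    return sorted(
        n for n, digest in primary_hashes.items()
        if standby_hashes.get(n) != digest
    )


# Hypothetical memory images with three dirty pages; page 1 differs.
primary = {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}
standby = {0: b"aaaa", 1: b"bbXb", 2: b"cccc"}
dirty = [0, 1, 2]

diff = pages_to_transfer(primary, dirty, page_hashes(standby, dirty))

# T3-T4: copy only the differing pages to the standby.
for n in diff:
    standby[n] = primary[n]

same_ratio = (len(dirty) - len(diff)) / len(dirty)
```

  • Only page 1 is transferred in this example; the higher the same dirty page ratio, the fewer pages have to cross the network.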
  • FIG. 5 shows a method flow for triggering synchronization of the active and standby virtual machines.
  • the method 500 includes:
  • the synchronization module of the active and standby virtual machines records, at each synchronization, the same memory dirty page ratio of the primary virtual machine and the standby virtual machine, together with the CPU load threshold and the disk input/output (I/O) load threshold in effect when the primary and standby virtual machines are synchronized.
  • the corresponding weight is given to the threshold according to the same dirty page ratio. Then multiply the last n load thresholds by their weights, sum the total values, and divide by n to get the load threshold for triggering the synchronization of the active and standby virtual machines next time.
  • For example, at the j-th synchronization of the primary and standby virtual machines, the active/standby synchronization module of the primary virtual machine assigns the CPU threshold c_j a weight w_j according to the same-dirty-page ratio at that synchronization. The CPU thresholds of the last n synchronizations are multiplied by their corresponding weights and summed, and the sum is divided by n to obtain the load value that the CPU needs to reach to start the (j+1)-th synchronization, that is, $c_{j+1} = \frac{1}{n}\sum_{i=j-n+1}^{j} w_i c_i$. The disk I/O load threshold is adjusted in the same way.
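  • The threshold adjustment described above can be sketched as follows. The function name is hypothetical, and the weights are taken as given inputs, because the text does not state how a weight is derived from the same-dirty-page ratio.

```python
def next_load_threshold(thresholds, weights, n):
    """Average of the last n thresholds, each multiplied by its weight.

    thresholds[i] and weights[i] belong to the same past synchronization;
    the most recent synchronization is last in both lists.
    """
    if n < 1 or len(thresholds) < n or len(thresholds) != len(weights):
        raise ValueError("need n >= 1 and at least n matching entries")
    recent = zip(thresholds[-n:], weights[-n:])
    return sum(c * w for c, w in recent) / n


# Three past synchronizations; only the last two are used when n = 2.
print(next_load_threshold([10.0, 20.0, 30.0], [1.0, 0.5, 1.5], 2))
```

  • The same function would serve for the disk I/O threshold, called with the disk thresholds and their weights.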
  • the initial accumulated load value SUM_0 is equal to 0
  • the initial load value CPU_Tick_A0 is equal to 0
  • the count value is the number of load measurements performed by the primary virtual machine.
  • S530 Acquire a current load value, compare the current load value with a set load threshold, and determine whether synchronization is performed.
  • The active/standby synchronization module obtains the load of the virtual machine, compares it with the threshold, and starts synchronization when the load is below the threshold.
  • the process is as follows:
  • Step 1: The thread on the master node that is responsible for synchronizing the active and standby virtual machines (that is, the synchronization thread) calls the clock function to obtain CPU_Tick_1, the CPU time that the virtual machine has consumed from startup to the first moment.
  • Step 2: The synchronization thread waits for Δt_1, a value set in advance by the master node. So that an idle virtual machine can be detected quickly while avoiding the error caused by too short a monitoring period, Δt_1 can be set, for example, to 100 microseconds.
  • Step 3: The synchronization thread calls the clock function again to obtain CPU_Tick_2, the CPU time that the virtual machine has consumed from startup to the second moment.
  • Step 4: If CPU_Tick_2 - CPU_Tick_1 < c, the CPU is idle and the procedure goes to step 5. Otherwise, the synchronization thread sleeps for Δt_1 and then calls the clock function again to obtain the CPU time consumed from startup to the current time, repeating until the current CPU time minus the previous CPU time is less than the CPU load threshold. Here c is the CPU load threshold that the CPU must reach to trigger synchronization of the active and standby virtual machines.
  • Step 5: The synchronization thread obtains disk_time_1, the time that the virtual machine has spent on disk I/O from startup to the current time, through the Linux Netlink interface.
  • Step 6: The synchronization thread waits for Δt_2, a value set in advance by the master node according to the performance of the physical disk, and then obtains disk_time_2 in the same way. For example, if a disk I/O operation takes 5 milliseconds, Δt_2 can be set to 5 milliseconds.
  • Step 7: If disk_time_2 - disk_time_1 < d, the disk I/O is idle and the master-slave synchronization starts. Otherwise, the synchronization thread sleeps for Δt_2 and again obtains, through the Linux Netlink interface, the disk I/O time from startup to the current time, repeating until the current disk I/O time minus the previous disk I/O time (that is, the current disk I/O load value) is less than the disk I/O load threshold. Here d is the disk load threshold that the disk must reach to trigger synchronization of the active and standby virtual machines.
  • The above process first determines whether the CPU load exceeds the CPU load threshold and then determines whether the disk I/O load exceeds the disk I/O load threshold. As an optional example, it can instead first determine whether the disk I/O load exceeds the disk I/O load threshold and then determine whether the CPU load exceeds the CPU load threshold. In addition, if other parameters affect the same-dirty-page ratio of the active and standby virtual machines, whether to synchronize the active and standby virtual machines can also be determined from those parameters by the above method.
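  • The polling procedure above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the cumulative CPU and disk I/O times are supplied by caller-provided callables rather than by the clock function or the Linux Netlink interface.

```python
import time


def wait_until_idle(read_counter, threshold, interval, sleep=time.sleep):
    """Return once the cumulative counter grows by less than `threshold`
    over one `interval`; the returned value is that last growth."""
    previous = read_counter()
    while True:
        sleep(interval)
        current = read_counter()
        if current - previous < threshold:
            return current - previous
        previous = current


def wait_for_sync_window(read_cpu, read_disk, c, d,
                         dt1=100e-6, dt2=5e-3, sleep=time.sleep):
    """CPU check first, then disk I/O check, as in the described order."""
    wait_until_idle(read_cpu, c, dt1, sleep)
    wait_until_idle(read_disk, d, dt2, sleep)
    return True


# Fake cumulative counters stand in for the real measurements.
cpu_samples = iter([0, 50, 90, 95])
disk_samples = iter([0, 2])
ready = wait_for_sync_window(lambda: next(cpu_samples),
                             lambda: next(disk_samples),
                             c=10, d=5, sleep=lambda _: None)
print(ready)
```

  • Swapping the two calls inside `wait_for_sync_window` gives the optional disk-first order mentioned in the text.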
  • When synchronization is initiated, the master-slave synchronization module generates a special packet descriptor that contains a pointer to a null address as well as information indicating the length (zero) of the packet and the type of the packet. The synchronization module occupies the mutex of the buffer queue, inserts the packet descriptor into the buffer queue, and then releases the queue mutex.
  • When synchronizing, the primary virtual machine compares the dirty pages of the active and standby virtual machines and transfers only the inconsistent dirty pages to the standby virtual machine.
  • FIG. 7 shows another flow chart of the data synchronization processing method provided by the present application.
  • the terminal access point (TAP) character device (/dev/tapX) of the master node becomes readable.
  • When the Qemu main loop thread finds that the TAP character device is readable, it attempts to occupy the global mutex and read the client packet from the character device.
  • The Qemu main loop thread then generates a descriptor for the packet. The descriptor includes a pointer to the packet and information describing the length and type of the packet, where the packet type is a client request.
  • the primary node occupies the mutex of the buffer queue and populates the packet descriptor into the buffer queue, then releases the queue mutex.
  • The thread of the middle-tier module that is responsible for consistency negotiation occupies the mutex of the buffer queue and then checks whether the buffer queue is empty. If the buffer queue is not empty, the thread reads the members of the queue (that is, the descriptors) in turn, fills the packets described by the members into the consistency log of the Paxos protocol, deletes the members from the queue, and releases the memory occupied by the original data packets.
  • The thread responsible for consistency negotiation reads until the queue is empty and then releases the mutex of the queue. After the queue mutex is released, the thread checks whether the consistency log of the Paxos protocol contains members that are waiting to be processed (not yet negotiated). If so, the thread negotiates those members with the other nodes according to the Paxos algorithm.
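  • The descriptor, the mutex-protected buffer queue, and the drain into the consistency log can be sketched together as follows. All class and function names are hypothetical; the pointer to the packet is modeled by holding the bytes directly, with None standing for the null address, and the Paxos negotiation itself is not shown.

```python
import threading
from collections import deque
from dataclasses import dataclass
from typing import Optional

CLIENT_REQUEST = "client_request"
SYNC_REQUEST = "sync_request"


@dataclass
class Descriptor:
    payload: Optional[bytes]  # stands in for the pointer; None = null address
    length: int
    kind: str


def client_descriptor(packet):
    return Descriptor(packet, len(packet), CLIENT_REQUEST)


def sync_descriptor():
    # The special descriptor: null pointer, zero length, sync-request type.
    return Descriptor(None, 0, SYNC_REQUEST)


class BufferQueue:
    def __init__(self):
        self.mutex = threading.Lock()
        self.items = deque()

    def put(self, descriptor):
        with self.mutex:  # occupy the queue mutex, insert, then release
            self.items.append(descriptor)


def drain_into_log(queue, log):
    """Move every queued descriptor into the consistency log, in order."""
    moved = 0
    with queue.mutex:
        while queue.items:
            log.append({"entry": queue.items.popleft(), "negotiated": False})
            moved += 1
    return moved


def pending(log):
    """Log members that still have to be negotiated with the other nodes."""
    return [member for member in log if not member["negotiated"]]


queue = BufferQueue()
log = []
queue.put(client_descriptor(b"SELECT 1"))
queue.put(sync_descriptor())
moved = drain_into_log(queue, log)
print(moved, [member["entry"].kind for member in pending(log)])
```

  • Holding the queue mutex only while copying descriptors keeps the critical section short; the negotiation of the pending members happens after the mutex is released.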
  • the active/standby synchronization module determines the timing of the synchronization between the active and standby virtual machines according to the methods shown in FIG. 5 and FIG. 6.
  • When the active/standby synchronization module determines to trigger synchronization of the active and standby virtual machines, it generates an active/standby synchronization request, occupies the mutex of the buffer queue, inserts the request into the buffer queue, and then releases the queue mutex. Both active/standby synchronization requests and data from the client must go through consistency negotiation before they can be processed.
  • After the thread responsible for consistency negotiation determines, through the Paxos algorithm, that a data packet has been agreed,
  • it writes the descriptor of the data packet to the pipe, so that the Qemu main loop thread reads the descriptor from the pipe.
  • The packet is then processed according to the packet type indicated by the descriptor. For example, when the packet comes from the client, it is sent through the virtual NIC to the virtual machine for processing.
  • FIG. 8 is a schematic diagram of a consistency negotiation method provided by the present application.
  • The distributed system shown in FIG. 8 includes an observer node in addition to the primary node and the standby node, so that the requirements of the Paxos algorithm can be satisfied. The observer node can also be replaced with a standby node.
  • a distributed system suitable for the present application may also include more standby nodes.
  • the primary and standby virtual machines are in hot standby and run the same distributed database program in parallel.
  • the observer node virtual machine is in a standby state.
  • Qemu on each of the three nodes has a consistency negotiation module, and client network requests and master-slave synchronization requests are negotiated according to the Paxos algorithm. Observer nodes only participate in Paxos negotiation and do not participate in active/standby synchronization.
  • The thread of the middle-layer software module that is responsible for consistency negotiation is driven by network I/O events triggered by the delivery of Paxos algorithm messages.
  • When the thread responsible for consistency negotiation receives a negotiation message sent from another node, it processes the message according to the Paxos algorithm.
  • After the thread responsible for consistency negotiation determines that a data packet has been agreed, it sends the data packet to the virtual machine if the packet is a client request, and it notifies the active/standby synchronization module to initiate synchronization if the packet is an active/standby synchronization request.
  • On the observer node, the thread of the middle-layer software module that is responsible for consistency negotiation is likewise driven by network I/O events triggered by the delivery of Paxos algorithm messages.
  • When the thread responsible for consistency negotiation receives a negotiation message sent from another node, it processes the message according to the Paxos algorithm. Because the observer node's virtual machine is in the standby state, once the thread determines that a data packet has completed consistency negotiation, it discards the packet, whether the packet is a client request or an active/standby synchronization request.
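  • The handling of an agreed packet on the different node roles can be sketched as follows. The role and type strings, the function name, and the two callbacks are hypothetical stand-ins for the virtual machine delivery path and the active/standby synchronization module.

```python
def handle_agreed(role, kind, packet, deliver_to_vm, start_sync):
    """Decide what a node does with a packet once negotiation has completed."""
    if role == "observer":
        # The observer only takes part in negotiation; it drops the packet.
        return "discarded"
    if kind == "client_request":
        deliver_to_vm(packet)
        return "delivered"
    if kind == "sync_request":
        start_sync()
        return "sync_started"
    raise ValueError(f"unknown packet type: {kind}")


delivered = []
syncs = []
results = [
    handle_agreed("primary", "client_request", b"q1",
                  delivered.append, lambda: syncs.append(1)),
    handle_agreed("standby", "sync_request", None,
                  delivered.append, lambda: syncs.append(1)),
    handle_agreed("observer", "client_request", b"q2",
                  delivered.append, lambda: syncs.append(1)),
]
print(results)
```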
  • FIG. 9 is still another flowchart of the data synchronization processing method provided by the present application.
  • When the client data packet arrives at the physical network card of the master node, the master node (that is, the host operating system) invokes the driver of the physical network card, in which the software bridge in the Linux kernel is used to forward data. At the software bridge layer, the master node determines which device the packet is destined for and calls the bridge's send function to send the packet to the corresponding port. If the packet is destined for the virtual machine, it is forwarded through the TAP device.
  • The TAP is equivalent to an Ethernet device that operates on Layer 2 packets, that is, Ethernet data frames.
  • The character device (/dev/tapX) of the TAP device is responsible for forwarding packets between kernel space and user space.
  • The Qemu main loop thread loops continuously, using the select system call to determine which file descriptors have changed state, including the TAP device file descriptor and the pipe device file descriptor.
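  • A select-based readiness check of this kind can be sketched as follows. This is not Qemu's code: the helper name is hypothetical, two ordinary pipes stand in for the TAP device and the pipe device, and a POSIX system is assumed (on Windows, `select` accepts only sockets).

```python
import os
import select


def ready_sources(sources, timeout=0.0):
    """Return the names of the sources whose file descriptor is readable.

    `sources` maps a name to a file descriptor.
    """
    by_fd = {fd: name for name, fd in sources.items()}
    readable, _, _ = select.select(list(by_fd), [], [], timeout)
    return sorted(by_fd[fd] for fd in readable)


tap_read, tap_write = os.pipe()      # stand-in for /dev/tapX
pipe_read, pipe_write = os.pipe()    # stand-in for the negotiation pipe
watched = {"tap": tap_read, "pipe": pipe_read}

before = ready_sources(watched)      # nothing has been written yet
os.write(pipe_write, b"x")
after = ready_sources(watched)       # only the pipe is readable now
print(before, after)

for fd in (tap_read, tap_write, pipe_read, pipe_write):
    os.close(fd)
```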
  • When the Qemu main loop thread finds that the TAP character device is readable, it attempts to occupy the global mutex and read the client packet from the character device.
  • The Qemu main loop thread then generates a descriptor for the packet. The descriptor contains a pointer to the packet and information indicating the length and type of the packet, where the type of the packet is client data packet.
  • the master node initiates the master-slave synchronization to generate a synchronization request packet.
  • An automatic threshold adjustment algorithm is deployed in the active/standby synchronization module of the primary node Qemu (as shown in S301 to S304).
  • the active/standby synchronization module of the master node monitors the CPU load and disk I/O load of the virtual machine, and compares the load threshold and the virtual machine load to determine whether to initiate synchronization.
  • When the master node initiates synchronization, the master-slave synchronization module generates a special packet descriptor that contains a pointer to the null address as well as information indicating the length (zero) of the packet and the type of the packet.
  • the type of the packet here is the primary and secondary synchronization request.
  • the synchronization module occupies the mutex of the buffer queue and populates the packet descriptor into the buffer queue, then releases the queue mutex.
  • When the primary virtual machine synchronizes, it compares the dirty pages of the active and standby virtual machines and transmits only the inconsistent dirty pages to the standby virtual machine.
  • S3 The master node inserts the packet descriptor into the buffer queue and performs consistency negotiation on the data packet.
  • Figure 10 consists of two parts. One part is the processing flow of the Qemu main loop thread.
  • This processing flow consists of three steps: occupying the mutex of the buffer queue, filling the packet descriptor into the buffer queue, and releasing the queue mutex.
  • The other part is the middle-tier thread responsible for consistency negotiation, which is driven by events (timer events or network I/O events). For example, when a timer event is triggered, the consistency negotiation thread first occupies the mutex of the buffer queue and then checks whether the buffer queue is empty. If the buffer queue is not empty, the consistency negotiation thread reads the members of the queue in turn, inserts the packet described by each member into the consistency log of the Paxos protocol, removes the member from the queue, and releases the memory occupied by the original packet. The consistency negotiation thread reads until the queue is empty and then releases the mutex of the queue. After the queue mutex is released, the consistency negotiation thread checks whether the consistency log of the Paxos protocol contains members that are waiting to be processed (not yet negotiated). If so, the thread negotiates those members with the other nodes according to the Paxos algorithm.
  • S4 The master node determines the type of the data packet after the negotiation is reached.
  • The consistency negotiation thread needs to listen for network I/O events that are triggered by received Paxos algorithm messages. When the consistency negotiation thread receives a negotiation message sent by another node, it processes the message according to the Paxos algorithm. If the consistency negotiation thread determines that a data packet has been agreed through the Paxos algorithm, it determines the type of the packet from the information the packet contains. This is possible because, when the consistency negotiation thread performs consistency negotiation on an original data packet (the data packet as it was before being inserted into the buffer queue), it encapsulates the original packet; the encapsulated packet contains, in addition to the original data packet, other information such as information indicating the type of the original packet. The consistency negotiation thread sends the encapsulated packet to the standby node.
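  • One way to picture the encapsulation is a small header that carries the original packet's type. The layout below (a 1-byte type followed by a 4-byte length, in network byte order) and all names are assumptions made for illustration; the text does not specify a wire format.

```python
import struct

CLIENT_REQUEST = 1
SYNC_REQUEST = 2

# Assumed header: packet type (1 byte) and original length (4 bytes).
HEADER = struct.Struct("!BI")


def encapsulate(kind, original=b""):
    """Prefix the original packet with its type and length."""
    return HEADER.pack(kind, len(original)) + original


def decapsulate(message):
    """Recover (type, original packet) from an encapsulated message."""
    kind, length = HEADER.unpack_from(message)
    original = message[HEADER.size:HEADER.size + length]
    if len(original) != length:
        raise ValueError("truncated message")
    return kind, original


wrapped = encapsulate(CLIENT_REQUEST, b"GET k")
print(decapsulate(wrapped))
print(decapsulate(encapsulate(SYNC_REQUEST)))
```

  • A synchronization request has no payload, so its encapsulated form consists of the header alone.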
  • the client data packet is forwarded to the Qemu main loop, and the Qemu main loop performs a logical operation of the virtual network card (such as RTL8139) on the client data packet.
  • The consistency negotiation thread first writes the length of the packet to the pipe associated with the Qemu main loop and then writes the packet content.
  • When the Qemu main loop thread finds that the file descriptor of the pipe has become readable, it occupies the global mutex and reads one integer from the pipe. This integer is the length of the packet sent through the pipe. Using this integer, the Qemu main loop thread then reads data of the corresponding length, that is, the data packet, from the pipe.
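  • The length-then-content exchange over the pipe can be sketched as follows. The function names are hypothetical, and encoding the length as one native C int is an assumption; the text only says that an integer is written first.

```python
import os
import struct

LENGTH = struct.Struct("=i")  # assumed encoding: one 4-byte native-order int


def write_packet(fd, packet):
    """Writer side: the length first, then the packet content."""
    os.write(fd, LENGTH.pack(len(packet)))
    os.write(fd, packet)


def read_exact(fd, count):
    """Read exactly `count` bytes, looping over short reads."""
    chunks = []
    while count > 0:
        chunk = os.read(fd, count)
        if not chunk:
            raise EOFError("pipe closed before the full message arrived")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def read_packet(fd):
    """Reader side: one integer, then that many bytes of packet data."""
    (length,) = LENGTH.unpack(read_exact(fd, LENGTH.size))
    return read_exact(fd, length)


read_fd, write_fd = os.pipe()
write_packet(write_fd, b"SELECT 1")
write_packet(write_fd, b"")          # a zero-length packet is still framed
first = read_packet(read_fd)
second = read_packet(read_fd)
os.close(read_fd)
os.close(write_fd)
print(first, second)
```

  • The length prefix lets the reader find packet boundaries in the byte stream, since a pipe does not preserve message boundaries by itself.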
  • the Qemu main loop thread then calls the RTL8139_do_receiver function, which performs the logical operation equivalent to the hardware RTL8139 NIC in this function.
  • The kernel-based virtual machine (KVM) operates the virtual RTL8139 through emulated I/O instructions, copying the packet into the client address space and placing it at the corresponding I/O address.
  • the Qemu main loop thread releases the global mutex.
  • S6 The application in the virtual machine processes the client data packet.
  • the database program in the virtual machine performs the query action after receiving the client data packet, and returns the execution result.
  • the consistency negotiation thread notifies the active/standby synchronization module to initiate synchronization.
  • The virtual machine generates a synchronization data frame, and the data frame is placed in a buffer queue of the primary node for transmission.
  • The master-slave synchronization module can be implemented by the third thread of Qemu,
  • and the consistency negotiation layer module can be implemented by the second thread of Qemu, where the second thread and the third thread are both Qemu worker threads.
  • the master node includes corresponding hardware structures and/or software modules for performing various functions.
  • The present application can be implemented in hardware or in a combination of hardware and computer software, in connection with the elements and algorithm steps of the examples described in the embodiments disclosed herein. Whether a function is implemented in hardware or by computer software driving hardware depends on the specific application and the design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
  • The present application may divide the master node into functional units according to the above method example.
  • each functional unit may be divided according to each function, or two or more functions may be integrated into one processing unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of the unit in the present application is schematic, and is only a logical function division, and the actual implementation may have another division manner.
  • FIG. 11 is a schematic structural diagram of a possible data synchronization processing apparatus provided by the present application.
  • The data synchronization processing device 1100 may be a software module or a hardware module included in the master node, and the data synchronization processing device 1100 includes a first thread control unit 1101 and a second thread control unit 1102.
  • The first thread control unit 1101 and the second thread control unit 1102 are used to control and manage the actions of the data synchronization processing device 1100.
  • For example, the first thread control unit 1101 and the second thread control unit 1102 are configured to support the data synchronization processing device 1100 in performing the steps of FIG. 3 and/or other processes of the techniques described herein.
  • The first thread control unit 1101 is configured to acquire the first to-be-processed information, where the first to-be-processed information is the first data packet or the first indication information, the first indication information is used to indicate the first data packet, and the first thread control unit 1101 is configured to execute non-thread-safe code; and to write the first to-be-processed information into the buffer module;
  • the second thread control unit 1102 is configured to perform a consistency negotiation process on the first to-be-processed information, where the consistency negotiation process is used to synchronize the order in which the primary node and the standby node process the first data packet;
  • the first thread control unit 1101 is further configured to process the first data packet according to the result of the second thread control unit 1102 performing the consistency negotiation process.
  • the data synchronization processing device 1100 can execute code by the first thread control unit 1101 and the second thread control unit 1102 to complete the corresponding task.
  • The first thread control unit 1101 is configured to execute non-thread-safe code. Therefore, the first thread control unit 1101 needs to occupy a mutex when performing an operation; for example, it needs to occupy the global mutex before acquiring the first to-be-processed information.
  • The manner in which the first thread control unit 1101 acquires the first to-be-processed information is not limited in this application.
  • After acquiring the first to-be-processed information, the first thread control unit 1101 writes it to the buffer module. The buffer module may be a buffer queue, a heap or stack for buffering the first to-be-processed information, or another data structure for buffering the first to-be-processed information, which is not limited in this application.
  • the global mutex can be released, and other threads can occupy the global mutex and schedule the virtual machine to perform other tasks.
  • The second thread control unit 1102 reads at least one piece of to-be-processed information from the buffer module and determines, based on the consistency negotiation protocol, a common order in which the active and standby nodes process the data packets. Subsequently, the first thread control unit 1101 occupies the global mutex and processes the data packets in the order determined by the second thread control unit 1102. Because the consistency negotiation of the active and standby nodes is performed by the second thread control unit 1102, which does not need to occupy the global mutex when working, a master node configured with the data synchronization processing device 1100 can use the primary virtual machine to process other tasks while performing the synchronization operation of the active and standby virtual machines, and therefore has higher performance than master nodes in the prior art.
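  • The division of work between the two thread control units can be sketched as follows. This is a simplified model under stated assumptions: all names are hypothetical, a `queue.Queue` stands in for the pipe, and the negotiation is reduced to passing the packets on in buffer order.

```python
import queue
import threading

global_mutex = threading.Lock()   # stands in for the global mutex
buffer_mutex = threading.Lock()   # exclusive right to the buffer module
buffer = []
agreed = queue.Queue()            # stands in for the pipe between the threads
processed = []


def first_thread_enqueue(packet):
    """First thread: holds the global mutex only while buffering."""
    with global_mutex:
        with buffer_mutex:
            buffer.append(packet)


def second_thread_negotiate():
    """Second thread: never touches the global mutex."""
    with buffer_mutex:
        batch = list(buffer)
        buffer.clear()
    for packet in batch:          # the real negotiation would happen here
        agreed.put(packet)
    agreed.put(None)              # sentinel: nothing more in this batch


def first_thread_process():
    """First thread: processes packets in the agreed order."""
    while True:
        packet = agreed.get()
        if packet is None:
            break
        with global_mutex:
            processed.append(packet)


for name in ("p1", "p2", "p3"):
    first_thread_enqueue(name)
worker = threading.Thread(target=second_thread_negotiate)
worker.start()
first_thread_process()
worker.join()
print(processed)
```

  • Because the worker never takes the global mutex, the first thread is free to do other work under that mutex while the worker is negotiating.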
  • the second thread control unit 1102 is specifically configured to:
  • the first to-be-processed information is written to the pipe according to the processed order of the first data packet, and the pipe is used by the first thread control unit 1101 to read the first to-be-processed information.
  • the first data packet may be a data packet obtained from the client, or may be a data packet generated by the master node, or may be other data packets.
  • The specific content of the first data packet is not limited in this application. Because some program code executed by the data synchronization processing device 1100 is not thread-safe, the second thread control unit 1102, as a worker thread, cannot directly call that program code. The consistency negotiation scheme provided in this embodiment therefore establishes a pipe for communication between the first thread control unit 1101 and the second thread control unit 1102. The second thread control unit 1102 writes the result of the consistency negotiation to the pipe, and the first thread control unit 1101 reads the result from the pipe, so that the consistency negotiation can be completed without the worker thread calling non-thread-safe code of the data synchronization processing device 1100.
  • the second thread control unit 1102 is further configured to: read the first to-be-processed information from the buffer module at a preset time.
  • The preset time is, for example, the time corresponding to a timer event. The second thread control unit 1102 can read the first to-be-processed information from the buffer module when triggered by the timer event, and the master node can set different timer events. Therefore, the above embodiment can flexibly trigger the second thread control unit 1102 to perform the consistency negotiation process.
  • the second thread control unit 1102 is further configured to: obtain the exclusive right to the buffer module, where the exclusive right to the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time;
  • the second thread control unit 1102 is further configured to: when the number of pieces of information to be processed in the buffer module is 0, release the exclusive right of the buffer module acquired by the second thread.
  • When the second thread control unit 1102 starts to work, it first occupies the exclusive right to the buffer module. This exclusive right, which may also be called a queue mutex, prohibits two or more thread control units from accessing the buffer module at the same time. When the number of pieces of to-be-processed information in the buffer module is 0, the second thread control unit 1102 releases the queue mutex, and other threads may continue to write new to-be-processed information to the buffer module.
  • the foregoing embodiment can prevent the new pending information from being inserted into the to-be-processed information queue that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • the second thread control unit 1102 is further specifically configured to:
  • the data packet corresponding to the to-be-processed information is written into the consistency log, and the to-be-processed information is deleted.
  • The consistency log is used to cache the data packets corresponding to the to-be-processed information, and the order of the data packets in the consistency log corresponds to the order in which those data packets are processed. The to-be-processed information includes the first to-be-processed information, and the data packets corresponding to the to-be-processed information include the first data packet.
  • a negotiation completion message is received, the negotiation completion message is used to indicate that the processed sequence of the first data packet has been accepted.
  • After the consistency negotiation process is executed, the to-be-processed information in the buffer module is deleted. This ensures that the information the second thread control unit 1102 reads from the buffer module is new to-be-processed information and prevents the second thread control unit 1102 from reading information that has already been processed, thereby improving the efficiency of the consistency negotiation process.
  • the first thread control unit 1101 is further configured to: acquire the exclusive right to the buffer module, where the exclusive right to the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time;
  • the first thread control unit 1101 is further configured to: release the exclusive permission of the buffer module acquired by the first thread control unit 1101.
  • Before writing to the buffer module, the first thread control unit 1101 first occupies the exclusive right to the buffer module. This exclusive right, which may also be referred to as a queue mutex, prohibits two or more thread control units from accessing the buffer module at the same time.
  • the second thread control unit 1102 can occupy the queue mutex lock and read the pending information in the buffer module.
  • the foregoing embodiment can prevent the new pending information from being inserted into the queue of the information to be processed that has completed the consistency negotiation process, thereby improving the reliability and efficiency of the consistency negotiation process.
  • the first virtual device runs a primary database
  • the standby node is configured with a second virtual device
  • the second virtual device runs a standby database
  • the first data packet carries a request that the client sends to the primary node for the primary database.
  • the first thread control unit 1101 is further configured to: obtain first to-be-processed information from the physical network card of the primary node; send the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node are processed in the same order. The first packet.
  • the device further includes a third thread control unit, and the third thread control unit is configured to:
  • the same-dirty-page ratios of the master node and the standby node at the n synchronization operations are w_1, ..., w_n, where c_1 corresponds to w_1, ..., and c_n corresponds to w_n, and n is a positive integer greater than or equal to 2;
  • L m is the load value of the primary node at the current time
  • a synchronization request is generated, the synchronization request is used to request synchronization of dirty pages of the primary node and the standby node;
  • the second thread control unit 1102 is further specifically configured to:
  • the first thread control unit 1101 is further specifically configured to:
  • the synchronization request is processed according to the result of performing the consistency negotiation process on the synchronization request.
  • The load threshold that the device uses for data synchronization is a dynamic and more suitable threshold, so that virtual machine resource utilization and the same-dirty-page ratio of the active and standby nodes can reach a better balance when data synchronization is performed.
  • the third thread control unit is further configured to:
  • SUM k is the sum of the load value obtained from the first load measurement of the master node to the load value obtained from the kth load measurement, and k is a positive integer.
  • the above embodiment can determine an initial load threshold so that the timing at which the primary node synchronizes data for the first time can be determined.
  • the load value of the primary node includes a processor load value and a memory load value
  • the load threshold of the primary node includes a processor load threshold and a memory load threshold
  • The processor load value may be compared with the processor load threshold first and the memory load value with the memory load threshold second, or the memory comparison may be made first and the processor comparison second, so that the timing of data synchronization between the active and standby nodes can be determined flexibly.
  • FIG. 12 shows another possible schematic diagram of the master node involved in the present application.
  • the master node 1200 includes a processor 1202, a transceiver 1203, and a memory 1201.
  • the transceiver 1203, the processor 1202, and the memory 1201 can communicate with each other through an internal connection path to transfer control and/or data signals.
  • The processing unit 1102 can be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure.
  • The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the communication unit 1103 can be a transceiver, a transceiver circuit, or the like.
  • the storage unit 1101 may be a memory.
  • the master node 1200 provided by the present application handles the consistency negotiation between the master and standby nodes by using the second thread, and the second thread does not need to hold the global mutex when working. Therefore, the master node 1200 can use the master virtual machine to handle other tasks while synchronizing the master and standby virtual machines, which improves the performance of the master node.
  • the master node in the apparatus embodiment corresponds completely to the master node in the method embodiment, and the corresponding modules perform the corresponding steps; for example, the communication module performs the sending or receiving steps of the method embodiment, and the steps other than sending and receiving may be performed by the processing module or the processor.
  • the magnitude of the sequence number of each process does not imply an order of execution; the order of execution of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the present application.
  • the function of the virtual machine may also be implemented by using a container, where the container and the virtual machine may be referred to as a virtual device.
  • the steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware or may be implemented by a processor executing software instructions.
  • the software instructions may be composed of corresponding software modules, which may be stored in a random access memory (RAM), a flash memory, a read only memory (ROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read only memory (CD-ROM) or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in the master node. Of course, the processor and the storage medium can also exist as discrete components in the master node.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in or transmitted by a computer readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), and so on.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Provided are a data synchronization processing method and apparatus, which are applied to a master node in a computer system, wherein the computer system further comprises a standby node connected to the master node. The method comprises: acquiring, by means of a first thread, first information to be processed, wherein the first information is a first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread is a thread that executes non-thread-safe code; writing, by means of the first thread, the first information to be processed into a buffer module; executing, by means of a second thread, consistency negotiation processing on the first information to be processed, wherein the consistency negotiation processing is used to synchronize the order in which the master node and the standby node process the first data packet; and processing the first data packet by means of the first thread according to a result of the consistency negotiation processing. By means of the method and the apparatus, a master node can process other tasks by means of a master virtual machine while carrying out synchronization processing on the master virtual machine and a standby virtual machine, thereby improving the performance of the master node.

Description

Method and apparatus for data synchronization processing
Technical field
The present application relates to the field of computers, and in particular, to a method and apparatus for data synchronization processing.
Background
Virtualization is part of a compute node (hereinafter referred to as a "node") and provides an isolated virtualized computing environment. A typical example of virtualization is the virtual machine (VM). A virtual machine is a virtual device simulated on a physical device by virtual machine software. To applications running inside them, virtual machines behave just like real physical devices: an operating system and applications can be installed on a virtual machine, and a virtual machine can access network resources.
To improve the reliability of service processing by virtual machines, the same database (that is, a distributed database) is usually configured on the master and standby virtual machines so that both process the same services; when the master virtual machine fails and cannot work normally, the standby virtual machine automatically takes over the services. Because the working states of the master and standby virtual machines are usually not identical, the standby virtual machine needs some time to synchronize the working states before taking over the services. The smaller the amount of data that the master and standby virtual machines need to synchronize, the shorter the time the standby virtual machine needs to take over the services.
In the prior art, the node where the master virtual machine is located (that is, the master node) synchronizes the master and standby virtual machines based on a consistency negotiation protocol, for example, by having the master and standby virtual machines process data packets in the same order and by synchronizing their states periodically or aperiodically, thereby reducing the difference between their working states.
However, a thread of the master node (for example, the main loop thread) must hold the global mutex while synchronizing the master and standby virtual machines. The global mutex prevents other threads from accessing the code corresponding to the master virtual machine, so the master virtual machine cannot process other tasks while the main thread is performing synchronization, and the performance of the master virtual machine drops significantly.
Summary of the invention
The present application provides a method and apparatus for data synchronization processing, which enable a master node to use the master virtual machine to process other tasks while synchronizing the master and standby virtual machines, thereby improving the performance of the master node.
In a first aspect, a data synchronization processing method is provided, which is applied to a simulator of a master node in a computer system, where the simulator is used to simulate a hardware device for a first virtual device of the master node, and the computer system further includes a standby node connected to the master node. The method includes: acquiring, by a first thread of the simulator, first to-be-processed information, where the first to-be-processed information is a first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread is a thread that executes non-thread-safe code; writing, by the first thread, the first to-be-processed information into a buffer module; performing, by a second thread of the simulator, consistency negotiation processing on the first to-be-processed information, where the consistency negotiation processing is used to synchronize the order in which the master node and the standby node process the first data packet; and processing, by the first thread, the first data packet according to the result of the consistency negotiation processing.
The master node can schedule the first thread and the second thread to execute code to complete the corresponding tasks. The first thread executes non-thread-safe code, so it must hold a mutex when performing operations; for example, the first thread must hold the global mutex before acquiring the first to-be-processed information. This application does not limit how the first thread acquires the first to-be-processed information. After acquiring the first to-be-processed information, the first thread writes it into the buffer module, which may be a buffer queue, a heap or stack used to buffer the first to-be-processed information, or another data structure used for that purpose; this application does not limit this. Once the first thread has written the first to-be-processed information into the buffer module, it can release the global mutex, and other threads can then hold the global mutex and schedule the virtual machine to perform other tasks. The second thread reads at least one piece of to-be-processed information from the buffer module and determines, based on the consistency negotiation protocol, a common order in which the master and standby nodes process data packets. The first thread then holds the global mutex and processes the data packets in the order determined by the second thread. Because the consistency negotiation between the master and standby nodes is performed by the second thread, which does not need to hold the global mutex while working, the master node can use the master virtual machine to process other tasks while synchronizing the master and standby virtual machines, which improves the performance of the master node.
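The division of work between the two threads can be illustrated with a minimal Python sketch. It is not the implementation described in this application: the names `global_mutex`, `buffer_module`, `ordered`, `negotiate` and `run_demo` are hypothetical, the negotiation with the standby node is replaced by a stub that always accepts the proposed order, and the main thread plays the role of the first thread.

```python
import queue
import threading

global_mutex = threading.Lock()   # stands in for the simulator's global mutex
buffer_module = queue.Queue()     # buffer module, modelled here as a FIFO queue
ordered = queue.Queue()           # negotiated order handed back to the first thread


def negotiate(packet):
    """Stub for consistency negotiation (assumption: the standby node always agrees)."""
    return packet


def first_thread_enqueue(packets):
    # The first thread holds the global mutex only while buffering each packet.
    for packet in packets:
        with global_mutex:
            buffer_module.put(packet)


def second_thread(count, lock_was_free):
    # The second thread negotiates without ever taking the global mutex.
    for _ in range(count):
        packet = buffer_module.get()
        lock_was_free.append(not global_mutex.locked())
        ordered.put(negotiate(packet))


def run_demo(packets):
    first_thread_enqueue(packets)
    lock_was_free = []
    worker = threading.Thread(target=second_thread, args=(len(packets), lock_was_free))
    worker.start()
    worker.join()
    processed = []
    for _ in packets:
        with global_mutex:  # the first thread re-takes the mutex to process packets
            processed.append(ordered.get())
    return processed, lock_was_free
```

In this sketch the global mutex is free for the whole time the worker is negotiating, which is the property the paragraph above relies on.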
Optionally, performing, by the second thread of the simulator, consistency negotiation processing on the first to-be-processed information includes: reading, by the second thread, the first to-be-processed information from the buffer module; performing, by the second thread, consistency negotiation processing on the first to-be-processed information to determine the order in which the first data packet is to be processed; and writing, by the second thread, the first to-be-processed information into a pipe according to the order in which the first data packet is to be processed, where the pipe is used by the first thread to read the first to-be-processed information.
The first data packet may be a data packet obtained from a client, a data packet generated by the master node, or another data packet; this application does not limit the specific content of the first data packet. Because some program code of the master node is non-thread-safe, the second thread, as a worker thread, cannot directly call the program code of the master node. The consistency negotiation processing scheme provided in this embodiment establishes a pipe between the first thread and the second thread for communication: the second thread writes the result of the consistency negotiation into the pipe so that the first thread can read the result through the pipe. In this way, the consistency negotiation can be completed without affecting the safety of the master node.
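A minimal Python sketch of such a pipe-based handoff is shown below. It assumes, purely for illustration, that each piece of to-be-processed information is a 32-bit packet identifier; the function names and the wire format are hypothetical and are not taken from this application.

```python
import os
import struct
import threading


def second_thread_write(write_fd, ordered_ids):
    # Write packet identifiers to the pipe in the negotiated processing order.
    for packet_id in ordered_ids:
        os.write(write_fd, struct.pack("!I", packet_id))
    os.close(write_fd)  # closing the write end signals end-of-stream


def read_exact(fd, n):
    # Read exactly n bytes unless the pipe is closed first.
    data = b""
    while len(data) < n:
        chunk = os.read(fd, n - len(data))
        if not chunk:
            break
        data += chunk
    return data


def first_thread_read(read_fd):
    # The first thread learns the order only through the pipe.
    ids = []
    while True:
        raw = read_exact(read_fd, 4)
        if len(raw) < 4:
            break
        ids.append(struct.unpack("!I", raw)[0])
    os.close(read_fd)
    return ids


def handoff(ordered_ids):
    read_fd, write_fd = os.pipe()
    writer = threading.Thread(target=second_thread_write, args=(write_fd, ordered_ids))
    writer.start()
    result = first_thread_read(read_fd)
    writer.join()
    return result
```

The second thread never calls into the first thread's code; the only shared object is the pipe, which is why the handoff does not require the worker to touch non-thread-safe code.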
Optionally, reading, by the second thread, the first to-be-processed information from the buffer module includes: reading, by the second thread, the first to-be-processed information from the buffer module at a preset time.
In this embodiment, the preset time is, for example, the time corresponding to a timer event. The second thread may read the first to-be-processed information from the buffer module when the timer event is triggered, and the master node may set different timer events; this embodiment can therefore flexibly trigger the second thread to perform consistency negotiation processing.
Optionally, before the first to-be-processed information is read from the buffer module by the second thread, the method further includes: acquiring, by the second thread, the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time. After the consistency negotiation processing is performed on the first to-be-processed information by the second thread, the method further includes: when the number of pieces of to-be-processed information in the buffer module is 0, releasing the exclusive permission of the buffer module acquired by the second thread.
When the second thread starts working, it first acquires the exclusive permission of the buffer module. This exclusive permission may also be called a queue mutex and is used to prohibit two or more threads from accessing the buffer module at the same time. When the number of pieces of to-be-processed information in the buffer module is 0, the second thread releases the queue mutex, and other threads can continue to write new to-be-processed information into the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information for which consistency negotiation processing has already been completed, thereby improving the reliability and efficiency of the consistency negotiation processing.
Optionally, performing, by the second thread, consistency negotiation processing on the first to-be-processed information includes: determining, by the second thread, the number of pieces of to-be-processed information in the buffer module; when the number of pieces of to-be-processed information in the buffer module is greater than 0, writing, by the second thread, the data packets corresponding to the to-be-processed information (including the first data packet) into a consistency log and deleting the to-be-processed information from the buffer module, where the consistency log is used to cache data packets, and the order of the data packets in the consistency log corresponds to the order in which those data packets are to be processed; sending, by the second thread, a consistency negotiation request including the first data packet, where the consistency negotiation request is used to request the standby node to accept the order in which the first data packet is to be processed; and receiving, by the second thread, a negotiation completion message, where the negotiation completion message indicates that the order in which the first data packet is to be processed has been accepted.
After reading the to-be-processed information, the second thread performs the consistency negotiation processing and then deletes the to-be-processed information from the buffer module. This ensures that the to-be-processed information the second thread reads from the buffer module each time is new, and prevents the second thread from reading to-be-processed information that has already been processed, thereby improving the efficiency of the consistency negotiation processing.
Optionally, before the first to-be-processed information is written into the buffer module by the first thread, the method further includes: acquiring, by the first thread, the exclusive permission of the buffer module, where the exclusive permission of the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time. After the first to-be-processed information is written into the buffer module by the first thread, the method further includes: releasing, by the first thread, the exclusive permission of the buffer module acquired by the first thread.
Before writing to the buffer module, the first thread first acquires the exclusive permission of the buffer module. This exclusive permission may also be called a queue mutex and is used to prohibit two or more threads from accessing the buffer module at the same time. After the first thread finishes writing to the buffer module, it releases the queue mutex, and the second thread can then hold the queue mutex and read the to-be-processed information in the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information for which consistency negotiation processing has already been completed, thereby improving the reliability and efficiency of the consistency negotiation processing.
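The interplay of the queue mutex, the buffer module and the consistency log can be sketched in Python as follows. This is an illustrative sketch only: `queue_mutex`, `buffer_module`, `consistency_log` and `send_negotiation_request` are hypothetical names, and the exchange with the standby node is replaced by a stub that always reports acceptance.

```python
import threading
from collections import deque

queue_mutex = threading.Lock()   # exclusive permission of the buffer module
buffer_module = deque()          # pending to-be-processed information
consistency_log = []             # packets in their negotiated processing order


def first_thread_write(packet):
    # The first thread holds the queue mutex only for the duration of the write.
    with queue_mutex:
        buffer_module.append(packet)


def send_negotiation_request(batch):
    """Stub for the request/completion exchange (assumption: always accepted)."""
    return True


def second_thread_negotiate():
    # The second thread keeps the queue mutex until the buffer is empty and the
    # negotiation is finished, so no new entry can slip into a negotiated batch.
    with queue_mutex:
        start = len(consistency_log)
        while buffer_module:                      # number of pending entries > 0
            consistency_log.append(buffer_module.popleft())
        batch = consistency_log[start:]
        if batch and send_negotiation_request(batch):
            return batch
        return []
```

Because entries are removed from the buffer as they are appended to the log, a later call to `second_thread_negotiate` only ever sees information that has not been negotiated yet.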
Optionally, a primary database runs in the first virtual device, the standby node is provided with a second virtual device, a standby database runs in the second virtual device, and the first data packet carries an access request for the primary database sent by a client to the master node,
acquiring, by the first thread of the simulator, the first to-be-processed information includes: acquiring, by the first thread, the first to-be-processed information from a physical network card of the master node;
processing, by the first thread, the first data packet according to the result of the consistency negotiation processing includes: sending, by the first thread, the first data packet to the primary database and the standby database simultaneously, so that the master node and the standby node process the first data packet in the same order.
Optionally, the method further includes:
acquiring, by a third thread of the simulator, the load thresholds of the master node at n synchronization operations and the same-dirty-page ratios of the master node and the standby node at those operations, where the load thresholds of the master node at the n synchronization operations are $c_1, \dots, c_n$, the same-dirty-page ratios of the master node and the standby node at the n synchronization operations are $w_1, \dots, w_n$, $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and n is a positive integer greater than or equal to 2;
determining, by the third thread, $w_m$, where $w_m$ is the load threshold at the current moment after the n synchronization operations, $w_m = [(c_1 \times w_1) + \dots + (c_n \times w_n)] \div n$, and m is a positive integer;
acquiring, by the third thread, $L_m$, where $L_m$ is the load value of the master node at the current moment;
if $L_m \le w_m$, generating, by the third thread, a synchronization request, where the synchronization request is used to request synchronization of the dirty pages of the master node and the standby node;
writing, by the third thread, the synchronization request into the buffer module;
performing, by the second thread, consistency negotiation processing on the synchronization request, where the result of the consistency negotiation processing on the synchronization request is used to synchronize the order in which the master node and the standby node process the synchronization request;
processing, by the first thread, the synchronization request according to the result of the consistency negotiation processing performed on the synchronization request.
In the prior art, the master node compares the load value at the current moment with a fixed load threshold to decide whether to start synchronization of the master and standby nodes: if the load value at the current moment is less than the fixed load threshold, synchronization is not started; if the load value at the current moment is greater than or equal to the fixed load threshold, synchronization is started. The disadvantage of this prior art is that it is difficult to determine the best time to start synchronizing the master and standby nodes with a fixed load threshold. If the fixed load threshold is set too small, for example to 0, then although the same-dirty-page ratio of the master and standby nodes is highest when the load value of the master node satisfies the condition (because the virtual machines of both nodes are no longer working and the dirty pages no longer change), the virtual machines of the master and standby nodes sit idle from the moment of load detection to the moment of data synchronization, so virtual machine resources are wasted. If the fixed load threshold is set too large, the virtual machines of the master and standby nodes are still working at the time of data synchronization and their same-dirty-page ratio is small, so the master and standby nodes need to transmit more data (that is, the data corresponding to the differing dirty pages), and data synchronization between the master and standby nodes consumes more network resources.
The following example illustrates how the technical solution provided by this application solves the above problem. Suppose that at the first load detection the processor working time of the master node is 10 minutes and the same-dirty-page ratio of the virtual machines of the master and standby nodes is 80%; at the second load detection the processor working time of the master node is 20 minutes and the same-dirty-page ratio is 85%; and at the third load detection the processor working time of the master node is 20 minutes and the same-dirty-page ratio is 85%. These data show that the virtual machine of the master node had stopped working at the second load detection at the latest, and possibly before it. If data synchronization starts after the second load detection, the virtual machine of the master node is necessarily left idle and virtual machine resources are wasted. The preferred time for data synchronization is therefore after the first load detection and before the second load detection: when data synchronization starts, the virtual machine of the master node has completed most or all of its work, which achieves a better balance between virtual machine resource utilization and the same-dirty-page ratio.
According to the embodiment provided above, the load threshold is determined from the load thresholds of the master node and the same-dirty-page ratios at at least two synchronization operations. For example, the same-dirty-page ratio of 80% obtained at the first load detection is used as a weight and multiplied by the load threshold 5, giving 4; the same-dirty-page ratio of 85% obtained at the second load detection is used as a weight and multiplied by the load threshold 6, giving 5.1; and 4 and 5.1 are added and divided by the number of load detections, 2, giving 4.55, the weighted average of the load thresholds of the two load detections. This weighted average is the new load threshold. If the processor working time obtained at the third load detection is 22 minutes, the load value of the master node at the third load detection is 2 (the working time 22 obtained at the third load detection minus the working time 20 obtained at the second load detection). This load value is less than the new load threshold 4.55, which indicates that the virtual machine of the master node has few remaining tasks and will soon become idle, so the data synchronization operation is started. If the processor working time obtained at the third load detection is 30 minutes, the load value of the master node at the third load detection is 10 (the working time 30 obtained at the third load detection minus the working time 20 obtained at the second load detection). This load value is greater than the new load threshold 4.55, which indicates that the virtual machine of the master node still has many remaining tasks and the same-dirty-page ratio of the master and standby nodes is small, so the data synchronization operation is not performed at the current moment; after the fourth load detection, whether to perform the data synchronization operation is determined by comparing the load value determined at the fourth load detection with the new load threshold 4.55.
In addition, because the new load threshold in this embodiment is a weighted average determined from the results of multiple load measurements, the load threshold gradually converges to a preferable value as the number of load detections increases.
In summary, in the data synchronization method provided by this embodiment, the load threshold is a dynamic, preferable threshold that allows virtual machine resource utilization and the same-dirty-page ratio of the master and standby nodes to reach a better balance when data synchronization is performed.
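The threshold update and the synchronization decision can be written as a short Python sketch. The function names are hypothetical; the formula is the weighted average given above, and the numbers in the usage note are those of the worked example (load thresholds 5 and 6, same-dirty-page ratios 80% and 85%).

```python
def next_load_threshold(thresholds, same_dirty_ratios):
    """Return [(c_1*w_1) + ... + (c_n*w_n)] / n for n >= 2 past synchronizations."""
    if len(thresholds) != len(same_dirty_ratios):
        raise ValueError("each load threshold needs a matching same-dirty-page ratio")
    if len(thresholds) < 2:
        raise ValueError("at least two synchronization operations are required")
    weighted = sum(c * w for c, w in zip(thresholds, same_dirty_ratios))
    return weighted / len(thresholds)


def should_sync(load_value, threshold):
    # A synchronization request is generated when the current load value
    # does not exceed the dynamic threshold.
    return load_value <= threshold
```

With the example figures, `next_load_threshold([5, 6], [0.80, 0.85])` gives approximately 4.55; a load value of 2 then triggers synchronization, while a load value of 10 does not.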
Optionally, before the third thread acquires the load thresholds of the master node at the n synchronization operations and the same-dirty-page ratios of the master node and the standby node, the method further includes:
acquiring, by the third thread, $SUM_k$, where $SUM_k$ is the sum of the load values of the master node from the first load measurement to the k-th load measurement, and k is a positive integer;
when $k \ge T_{count}$, determining, by the third thread, $c_0$, where $T_{count}$ is the load measurement count threshold, $c_0$ is the load threshold of the first synchronization operation of the master node, and $c_0 = SUM_k \div k$; or,
when $k < T_{count}$, acquiring, by the third thread, $L_{k+1}$, where $L_{k+1}$ is the load value of the master node obtained by the (k+1)-th load measurement and $T_{count}$ is the load measurement count threshold; acquiring, by the third thread, $SUM_{k+1}$, where $SUM_{k+1} = SUM_k + L_{k+1}$; and when $k+1 \ge T_{count}$, determining, by the third thread, $c_0$, where $c_0$ is the load threshold of the first synchronization operation of the master node and $c_0 = SUM_{k+1} \div (k+1)$.
For example, after the virtual machine of the primary node starts, the load measurement count (COUNT) equals 0. After the first load measurement, the first load value $L_1$ is obtained, so $SUM_1 = L_1$. After the second load measurement, the second load value $L_2$ is obtained, so $SUM_2 = SUM_1 + L_2$; that is, $SUM_2$ is positively correlated with both $SUM_1$ and $L_2$. If the measurement count threshold $T_{count}$ is 2, the initial load threshold $c_0$ equals $SUM_2$ divided by 2; that is, the initial load threshold is positively correlated with $SUM_2$ and negatively correlated with the number of measurements. If $T_{count}$ is 3, only two load measurements have been performed so far, so one more is needed: the third load value $L_3$ is determined and then $SUM_3 = SUM_2 + L_3$ is calculated; that is, $SUM_3$ is positively correlated with both $SUM_2$ and $L_3$. The initial load threshold $c_0$ then equals $SUM_3$ divided by 3; that is, the initial load threshold is positively correlated with $SUM_3$ and negatively correlated with the number of measurements.
The foregoing embodiment determines an initial load threshold, and thereby the time at which the primary node performs data synchronization for the first time.
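The procedure for the initial threshold can be sketched as follows. This is an illustrative sketch only: `measure_load` is a hypothetical callback that returns one load value per call, and the loop simply keeps measuring until the count reaches the measurement count threshold before averaging.

```python
def initial_threshold(measure_load, t_count):
    """Average the first t_count load measurements to obtain c_0."""
    if t_count < 1:
        raise ValueError("t_count must be a positive integer")
    total = 0.0  # SUM_k
    k = 0        # COUNT, zero right after the virtual machine starts
    while k < t_count:
        total += measure_load()  # adds L_{k+1}, giving SUM_{k+1}
        k += 1
    return total / k             # c_0 = SUM_k / k


if __name__ == "__main__":
    loads = iter([4.0, 6.0, 8.0])  # hypothetical load values L_1, L_2, L_3
    print(initial_threshold(lambda: next(loads), 2))
```

With a threshold of 2, only the first two values are consumed; with a threshold of 3, a third measurement is taken before the average is formed, as in the example above.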
Optionally, the load value of the primary node includes a processor load value and a memory load value, and the load threshold of the primary node includes a processor load threshold and a memory load threshold.
In this embodiment, the processor load value may be compared with the processor load threshold first and the memory load value with the memory load threshold second, or the memory comparison may be made first and the processor comparison second. The time at which the primary and standby nodes perform data synchronization can therefore be determined flexibly.
In a second aspect, a data synchronization processing apparatus is provided. The apparatus is applied to an emulator of a primary node in a computer system, the emulator is configured to emulate hardware devices for a first virtual device of the primary node, and the computer system further includes a standby node connected to the primary node. The apparatus includes:
a first thread control unit, configured to obtain first to-be-processed information and to write the first to-be-processed information into a buffer module, where the first to-be-processed information is a first data packet or first indication information, the first indication information indicates the first data packet, and the first thread control unit is configured to execute non-thread-safe code; and
a second thread control unit, configured to perform consistency negotiation processing on the first to-be-processed information, where the consistency negotiation processing is used to synchronize the order in which the primary node and the standby node process the first data packet;
where the first thread control unit is further configured to process the first data packet according to the result of the consistency negotiation processing performed by the second thread control unit.
The data synchronization processing apparatus can execute code through the first thread control unit and the second thread control unit to complete the corresponding tasks. The first thread control unit executes non-thread-safe code and therefore must hold a mutex while operating. For example, the first thread control unit must hold the global mutex before obtaining the first to-be-processed information; this application does not limit the manner in which the first thread control unit obtains the first to-be-processed information. After obtaining the first to-be-processed information, the first thread control unit writes it into the buffer module. The buffer module may be a buffer queue, a heap or stack used to buffer the first to-be-processed information, or another data structure used to buffer the first to-be-processed information; this application does not limit this. After writing the first to-be-processed information into the buffer module, the first thread control unit can release the global mutex, so that other threads can hold the global mutex and schedule the virtual machine to perform other tasks. The second thread control unit reads at least one piece of to-be-processed information from the buffer module and, based on a consistency negotiation protocol, determines a common order in which the primary and standby nodes process the data packets. The first thread control unit then holds the global mutex and processes the data packets in the order determined by the second thread control unit. Because the consistency negotiation between the primary and standby nodes is performed by the second thread control unit, which does not need to hold the global mutex while working, the data synchronization processing apparatus allows the primary node to use the primary virtual machine for other tasks while the primary and standby virtual machines are being synchronized, which improves the performance of the primary node.
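The division of work between the two thread control units can be illustrated with the following Python sketch. It is a simplified model under stated assumptions, not the emulator's actual code: `negotiate` is a hypothetical placeholder that always accepts the proposed order, `queue.Queue` stands in for both the buffer module and the pipe, and `None` is used as an end-of-input sentinel.

```python
import queue
import threading


def negotiate(packet):
    """Hypothetical stand-in for consensus with the standby node."""
    return packet


def run(packets):
    global_mutex = threading.Lock()  # stands in for the emulator-wide mutex
    buffer_module = queue.Queue()    # first thread -> second thread
    pipe = queue.Queue()             # second thread -> first thread
    processed = []

    def second_thread():
        # Never takes global_mutex, so negotiation does not block other work.
        for item in iter(buffer_module.get, None):
            pipe.put(negotiate(item))
        pipe.put(None)               # forward the end-of-input sentinel

    worker = threading.Thread(target=second_thread)
    worker.start()

    # First thread: hold the global mutex only while buffering each packet.
    for packet in packets:
        with global_mutex:
            buffer_module.put(packet)
    buffer_module.put(None)

    # First thread: take the mutex again to process in the agreed order.
    for packet in iter(pipe.get, None):
        with global_mutex:
            processed.append(packet)

    worker.join()
    return processed


if __name__ == "__main__":
    print(run(["p1", "p2", "p3"]))
```

The point of the design is that the potentially slow negotiation step runs outside the critical section, so the global mutex is held only for the short buffering and processing steps.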
Optionally, the second thread control unit is specifically configured to:
read the first to-be-processed information from the buffer module;
perform consistency negotiation processing on the first to-be-processed information to determine the order in which the first data packet is to be processed; and
write the first to-be-processed information into a pipe according to the order in which the first data packet is to be processed, where the pipe is used by the first thread control unit to read the first to-be-processed information.
The first data packet may be a data packet obtained from a client, a data packet generated by the primary node, or another data packet; this application does not limit the specific content of the first data packet. Because some program code of the primary node is not thread-safe, the second thread control unit, as a worker thread, cannot directly call that program code. The consistency negotiation scheme provided in this embodiment therefore establishes a pipe between the first thread control unit and the second thread control unit: the second thread control unit writes the result of the consistency negotiation into the pipe, and the first thread control unit reads the result from the pipe. Consistency negotiation is thus completed without affecting the safety of the primary node.
Optionally, the second thread control unit is further specifically configured to read the first to-be-processed information from the buffer module at a preset time.
In this embodiment, the preset time is, for example, the time corresponding to a timer event. The second thread control unit may read the first to-be-processed information from the buffer module when the timer event fires. Because the primary node can set different timer events, this embodiment can flexibly trigger the second thread control unit to perform consistency negotiation processing.
Optionally, before reading the first to-be-processed information from the buffer module, the second thread control unit is further specifically configured to obtain the exclusive right to the buffer module, where the exclusive right prevents two or more threads from accessing the buffer module at the same time; and
after performing consistency negotiation processing on the first to-be-processed information, the second thread control unit is further specifically configured to release the exclusive right to the buffer module that it obtained when the amount of to-be-processed information in the buffer module is 0.
When the second thread control unit starts working, it first takes the exclusive right to the buffer module. The exclusive right, which may also be called a queue mutex, prevents two or more thread control units from accessing the buffer module at the same time. When the amount of to-be-processed information in the buffer module is 0, the second thread control unit releases the queue mutex, and other threads can resume writing new to-be-processed information into the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information for which consistency negotiation has already been completed, which improves the reliability and efficiency of the consistency negotiation processing.
Optionally, the second thread control unit is further specifically configured to:
determine the amount of to-be-processed information in the buffer module;
when the amount of to-be-processed information is greater than 0, write the data packets corresponding to the to-be-processed information into a consistency log and delete the to-be-processed information, where the consistency log is used to cache the data packets corresponding to the to-be-processed information, the sequence of the data packets in the consistency log corresponds to the order in which those data packets are to be processed, the to-be-processed information includes the first to-be-processed information, and the corresponding data packets include the first data packet;
send a consistency negotiation request that includes the first data packet, where the consistency negotiation request asks the standby node to accept the order in which the first data packet is to be processed; and
receive a negotiation completion message, where the negotiation completion message indicates that the order in which the first data packet is to be processed has been accepted.
After reading the to-be-processed information, the second thread control unit performs consistency negotiation processing and then deletes the to-be-processed information from the buffer module. This ensures that whatever the second thread control unit reads from the buffer module each time is new to-be-processed information and that it never reads information that has already been handled, which improves the efficiency of the consistency negotiation processing.
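The steps above can be sketched as follows. This is a hedged illustration: the request dictionary, the `"negotiation_complete"` reply string, and the `send_request` callback are all hypothetical, since the application does not specify a message format.

```python
from collections import deque


def negotiate_batch(buffer_items, consistency_log, send_request):
    """Move buffered packets into the log, then ask the standby to accept."""
    start = len(consistency_log)
    while buffer_items:  # amount of to-be-processed information > 0
        # Appending in arrival order makes log order equal processing order.
        consistency_log.append(buffer_items.popleft())
    batch = consistency_log[start:]
    if not batch:
        return []
    reply = send_request({"first_index": start, "packets": batch})
    if reply != "negotiation_complete":
        raise RuntimeError("standby did not accept the proposed order")
    return batch


if __name__ == "__main__":
    log = ["p0"]
    pending = deque(["p1", "p2"])
    print(negotiate_batch(pending, log, lambda req: "negotiation_complete"))
    print(log)
```

Removing each item from the buffer as it is logged is what guarantees that a later call only ever sees new to-be-processed information.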
Optionally, before writing the first to-be-processed information into the buffer module, the first thread control unit is further configured to obtain the exclusive right to the buffer module, where the exclusive right prevents two or more threads from accessing the buffer module at the same time; and
after writing the first to-be-processed information into the buffer module, the first thread control unit is further configured to release the exclusive right to the buffer module that it obtained.
Before writing to the buffer module, the first thread control unit first takes the exclusive right to the buffer module. The exclusive right, which may also be called a queue mutex, prevents two or more thread control units from accessing the buffer module at the same time. When the first thread control unit finishes writing to the buffer module, it releases the queue mutex, and the second thread control unit can then hold the queue mutex and read the to-be-processed information in the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information for which consistency negotiation has already been completed, which improves the reliability and efficiency of the consistency negotiation processing.
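The queue mutex discipline described for the two thread control units can be sketched as below. The class and method names are hypothetical; the sketch assumes the writer holds the mutex for a single append, while the reader holds it for the whole drain and releases it only once the buffer is empty.

```python
import threading
from collections import deque


class BufferModule:
    """Buffer guarded by a queue mutex (the 'exclusive right' in the text)."""

    def __init__(self):
        self.queue_mutex = threading.Lock()
        self.items = deque()

    def write(self, info):
        # First thread: acquire, append one item, release immediately.
        with self.queue_mutex:
            self.items.append(info)

    def drain(self, negotiate):
        # Second thread: keep the mutex until the buffer is empty, so no new
        # item can slip in among items that are already being negotiated.
        agreed = []
        with self.queue_mutex:
            while self.items:
                agreed.append(negotiate(self.items.popleft()))
        return agreed


if __name__ == "__main__":
    buf = BufferModule()
    buf.write("a")
    buf.write("b")
    print(buf.drain(lambda info: info))
```

A writer that calls `write` while `drain` is running simply blocks until the drain finishes, which is the behavior the embodiment relies on.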
Optionally, a primary database runs in the first virtual device, the standby node is provided with a second virtual device, a standby database runs in the second virtual device, and the first data packet carries an access request for the primary database that a client sends to the primary node.
The first thread control unit is further specifically configured to: obtain the first to-be-processed information from the physical network interface card of the primary node; and send the first data packet to both the primary database and the standby database, so that the primary node and the standby node process the first data packet in the same order.
Optionally, the apparatus further includes a third thread control unit, configured to:
obtain the load thresholds of the primary node and the proportions of identical dirty pages between the primary node and the standby node at n synchronization operations, where the load thresholds of the primary node at the n synchronization operations are $c_1, \dots, c_n$, the proportions of identical dirty pages between the primary node and the standby node at the n synchronization operations are $w_1, \dots, w_n$, $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and n is a positive integer greater than or equal to 2;
determine $w_m$, where $w_m$ is the load threshold at the current time after the n synchronization operations, $w_m = [(c_1 \times w_1) + \dots + (c_n \times w_n)] \div n$, and m is a positive integer;
obtain $L_m$, where $L_m$ is the load value of the primary node at the current time;
if $L_m \le w_m$, generate a synchronization request, where the synchronization request is used to request synchronization of the dirty pages of the primary node and the standby node; and
write the synchronization request into the buffer module.
The second thread control unit is further specifically configured to:
perform consistency negotiation processing on the synchronization request, where the result of that processing is used to synchronize the order in which the primary node and the standby node process the synchronization request.
The first thread control unit is further specifically configured to:
process the synchronization request according to the result of the consistency negotiation processing performed on the synchronization request.
The load threshold used by the data synchronization apparatus provided in this embodiment is a dynamic, preferable threshold. When data synchronization is performed, it allows virtual machine resource utilization and the proportion of identical dirty pages between the primary and standby nodes to reach a good balance.
Optionally, before obtaining the load thresholds of the primary node and the proportions of identical dirty pages between the primary node and the standby node at the n synchronization operations, the third thread control unit is further specifically configured to:
obtain $SUM_k$, where $SUM_k$ is the sum of the load values of the primary node from the first load measurement through the $k$-th load measurement, and $k$ is a positive integer;
when $k \ge T_{count}$, determine $c_0$, where $T_{count}$ is the load measurement count threshold, $c_0$ is the load threshold for the first synchronization operation of the primary node, and $c_0 = SUM_k \div k$; or
when $k < T_{count}$, obtain $L_{k+1}$, where $L_{k+1}$ is the load value of the primary node obtained by the $(k+1)$-th load measurement and $T_{count}$ is the load measurement count threshold; obtain $SUM_{k+1}$, where $SUM_{k+1} = SUM_k + L_{k+1}$; and when $k+1 \ge T_{count}$, determine $c_0$, where $c_0$ is the load threshold for the first synchronization operation of the primary node and $c_0 = SUM_{k+1} \div (k+1)$.
The foregoing embodiment determines an initial load threshold, and thereby the time at which the primary node performs data synchronization for the first time.
Optionally, the load value of the primary node includes a processor load value and a memory load value, and the load threshold of the primary node includes a processor load threshold and a memory load threshold.
In this embodiment, the processor load value may be compared with the processor load threshold first and the memory load value with the memory load threshold second, or the memory comparison may be made first and the processor comparison second. The time at which the primary and standby nodes perform data synchronization can therefore be determined flexibly.
In a third aspect, a data synchronization processing apparatus is provided. The apparatus has the functions of the execution device that implements the method of the first aspect, and includes means for performing the steps or functions described in the foregoing method aspect. The steps or functions may be implemented by software, by hardware (such as a circuit), or by a combination of hardware and software.
In one possible design, the apparatus includes one or more processing units and one or more communication units. The one or more processing units are configured to support the apparatus in implementing the corresponding functions of the execution device of the foregoing method, for example, obtaining the first to-be-processed information through the first thread. The one or more communication units are configured to support communication between the apparatus and other devices and to implement receiving and/or sending functions, for example, obtaining the first data packet from a client.
Optionally, the apparatus may further include one or more memories. The memory is coupled to the processor and stores the program instructions and/or data necessary for the apparatus. The one or more memories may be integrated with the processor or arranged separately from the processor; this application does not limit this.
The apparatus may be a chip, in which case the communication unit may be an input/output circuit or an interface of the chip.
In another possible design, the apparatus includes a transceiver, a processor, and a memory. The processor is configured to control the transceiver or an input/output circuit to send and receive signals, the memory is configured to store a computer program, and the processor is configured to run the computer program in the memory so that the apparatus performs the method in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, a computer system is provided. The computer system includes the primary node and the standby node described in the first aspect, where the primary node is configured to perform the method in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, a computer-readable storage medium is provided for storing a computer program, where the computer program includes instructions for performing the method in the first aspect or any possible implementation of the first aspect.
In a sixth aspect, a computer program product is provided. The computer program product includes computer program code that, when run on a computer, causes the computer to perform the method in the first aspect or any possible implementation of the first aspect.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of a computer system applicable to this application;
FIG. 2 is a schematic diagram of virtual machine state replication applicable to this application;
FIG. 3 is a schematic diagram of a data synchronization processing method provided by this application;
FIG. 4 is a schematic diagram of data synchronization between primary and standby virtual machines provided by this application;
FIG. 5 is a schematic diagram of a method, provided by this application, for determining when the primary and standby virtual machines perform data synchronization;
FIG. 6 is a schematic diagram of a method for determining an initial load threshold provided by this application;
FIG. 7 is a schematic diagram of another data synchronization processing method provided by this application;
FIG. 8 is a schematic diagram of a consistency negotiation method provided by this application;
FIG. 9 is a schematic diagram of yet another data synchronization processing method provided by this application;
FIG. 10 is a schematic diagram of yet another data synchronization processing method provided by this application;
FIG. 11 is a possible schematic structural diagram of the data synchronization processing apparatus provided by this application;
FIG. 12 is another possible schematic structural diagram of the primary node provided by this application.
DETAILED DESCRIPTION
The technical solutions of this application are described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a computer system applicable to this application.
As shown in FIG. 1, the computer system 100 includes host 1 and host 2. Host 1 includes a hardware platform and a host operating system installed on the hardware platform, and further includes virtual machine 1 and quick emulator (Qemu) 1 running on the host operating system. Database 1 runs on virtual machine 1.
Qemu emulates hardware devices for use by virtual machines. In addition, Qemu can monitor the workload of the virtual machines running on it. The workload of a virtual machine includes its usage of the central processing unit (CPU) and its usage of the disk, where the CPU and the disk are located in the hardware platform.
Host 2 includes a hardware platform and a host operating system installed on the hardware platform, and further includes virtual machine 2 and Qemu2 running on the host operating system. Database 2 runs on virtual machine 2.
In the embodiments of the present invention, database 1 is the primary database and database 2 is the standby database. When database 1 becomes unavailable, database 2 can replace database 1 as the primary database for clients to access.
Virtual machine 1 and virtual machine 2 may serve as primary and standby virtual machines for each other; correspondingly, host 1 and host 2 serve as primary and standby nodes for each other. Host 1 and host 2 can communicate with each other through network interface cards (NICs), and each can communicate with the client.
If host 1 is the primary node and host 2 is the standby node, then when the client sends four data packets to host 1 in the order 1, 2, 3, 4, virtual machine 1 can process them in that order. In addition, host 1 negotiates, through the consistency negotiation module of Qemu1 and the consistency negotiation module of Qemu2 in host 2, that virtual machine 2 also processes the four data packets in the order 1, 2, 3, 4. Because virtual machine 1 and virtual machine 2 process the four data packets in the same order, only a small number of dirty memory pages differ between the primary node and the standby node, and little data needs to be transferred during synchronization.
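Why an agreed processing order keeps the two virtual machines close can be seen in a toy model. The sketch below is only an analogy under stated assumptions: key/value writes stand in for packets that modify memory pages, and all names and values are hypothetical.

```python
def apply_in_order(packets, order):
    """Replay key/value writes in the given order and return the final state."""
    state = {}
    for index in order:
        key, value = packets[index]
        state[key] = value
    return state


def differing_keys(a, b):
    """Keys whose values differ between two states (a stand-in for dirty pages)."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


if __name__ == "__main__":
    packets = [("x", 1), ("y", 2), ("x", 3), ("y", 4)]
    primary = apply_in_order(packets, [0, 1, 2, 3])
    same = apply_in_order(packets, [0, 1, 2, 3])
    other = apply_in_order(packets, [2, 3, 0, 1])
    print(differing_keys(primary, same))
    print(differing_keys(primary, other))
```

When both replicas replay the packets in the same order, their states match and nothing needs to be transferred; a different order leaves both keys divergent.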
For example, the consistency negotiation module may use the Paxos algorithm to negotiate a consistent processing order for data packets. When the Paxos algorithm is used, an observer node needs to be added to FIG. 1; the observer node may include a Qemu provided with a consistency negotiation module, as described in detail below.
The computer system 100 is merely an example, and the computer systems applicable to this application are not limited to it. For example, the computer system 100 may further include other hosts. In addition, different hosts may communicate with each other over radio waves or over Ethernet.
FIG. 2 is a schematic diagram of virtual machine state replication provided by this application.
As shown in FIG. 2, a Paxos negotiation module (that is, a consistency negotiation module) is deployed in the Qemu of each of the primary and standby nodes, and all virtual machines run the same database program in parallel. After a data packet from a client reaches the primary node, the Paxos module of the primary node negotiates with the Paxos modules of the standby nodes on the processing order of each received data packet, so that all virtual machines process the same data packets in the same order. As a result, only a small number of dirty memory pages differ between a standby node and the primary node, so synchronization can be completed by transferring little data, which improves synchronization efficiency.
In FIG. 2, the primary node and the standby node run the same database, and the shaded dirty memory pages (also called "dirty pages") represent the dirty pages of virtual machine 2 that differ from those of virtual machine 1.
It should be understood that the Paxos negotiation module shown in FIG. 2 is merely an example; other consistency algorithms are also applicable to this application.
As described in the background, deploying the Paxos negotiation module inside Qemu as in the prior art causes a large performance loss for the virtual machine. The following describes in detail how the data synchronization processing method provided by this application solves this problem.
FIG. 3 is a flowchart of a data synchronization processing method 300 provided by the present application. Method 300 is applied to a primary node in a computer system, specifically to Qemu1 of the primary node. The computer system further includes a standby node connected to the primary node. Method 300 includes:
S310: Obtain first to-be-processed information through a first thread. The first to-be-processed information is a first data packet or first indication information, where the first indication information indicates the first data packet, and the first thread is a thread that executes non-thread-safe code.
For example, the first indication information may include a pointer and a data size. The pointer indicates the storage address of the first data packet, and the first thread can obtain the first data packet by reading, at that storage address, an amount of data equal to the data size.
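The pointer-plus-size form of the indication information can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: a `bytearray` stands in for host memory, an integer offset stands in for the pointer, and the names `PacketDescriptor` and `read_packet` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketDescriptor:
    """Hypothetical indication information: where the packet is and how long it is."""
    offset: int  # stands in for the pointer to the packet's storage address
    size: int    # number of bytes that make up the packet


def read_packet(memory: bytearray, desc: PacketDescriptor) -> bytes:
    """Read exactly `size` bytes starting at the address the descriptor points to."""
    return bytes(memory[desc.offset:desc.offset + desc.size])


# A bytearray stands in for the memory that holds received packets.
memory = bytearray(64)
payload = b"SELECT 1"
memory[16:16 + len(payload)] = payload

desc = PacketDescriptor(offset=16, size=len(payload))
packet = read_packet(memory, desc)
```

Passing the small descriptor instead of the packet itself keeps the items placed in the buffer module at a fixed size, which is one plausible reason for offering both forms.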
S320: Write the first to-be-processed information into a buffer module through the first thread. For example, the buffer module may be a buffer queue, a heap or stack used to buffer the first to-be-processed information, or another data structure used for that purpose; this is not limited in the present application.
S330: Perform consistency negotiation processing on the first to-be-processed information through a second thread. The consistency negotiation processing is used to synchronize the order in which the primary node and the standby node process the first data packet.
S340: Process the first data packet through the first thread according to the result of the consistency negotiation processing.
In method 300, the primary node can schedule the first thread and the second thread to execute code that completes the corresponding tasks. For brevity, the present application sometimes describes this as "completing a task through the first thread" or "the first thread completes a task". For example, both "obtaining the first to-be-processed information through the first thread" and "the first thread obtains the first to-be-processed information" mean that the primary node schedules the first thread to execute code that obtains the first to-be-processed information.
The first thread is a thread that executes non-thread-safe code. For example, the first thread is Qemu's main loop thread (Qemu main loop), which executes Qemu's core code. It is a dedicated event-processing loop thread that calls the corresponding handler function to handle an event when the state of a file descriptor changes. The second thread is a Qemu worker thread.
Qemu's core code is not thread-safe: Qemu does not protect data accesses, so multiple Qemu threads modifying the same data one after another may leave that data inconsistent. The first thread therefore needs to hold a mutex while it operates. For example, the main loop thread must acquire the global mutex before obtaining the first to-be-processed information and releases it after writing that information into the buffer module. This ensures that, at any moment, only the main loop thread holding the global mutex can obtain the first to-be-processed information and write it into the buffer module.
In S310, the first to-be-processed information is any piece of to-be-processed information obtained by the primary node. It may be a data packet or a descriptor (that is, indication information) that indicates the data packet. For example, after receiving a data packet from a client, the primary node may write the data packet directly into the buffer module, or it may generate a descriptor indicating the data packet and write the descriptor into the buffer module. The descriptor may include a pointer to the data packet and information indicating the length and type of the data packet.
Besides the above examples, the first data packet may also be a data packet generated locally by the primary node. The present application does not limit the specific content of the first data packet or the way in which the primary node obtains it.
After obtaining the first to-be-processed information, the first thread writes it into the buffer module.
Once the first thread has written the first to-be-processed information into the buffer module, it can release the global mutex, and other threads can try to acquire the global mutex and perform other tasks. The second thread reads at least one piece of to-be-processed information from the buffer module and, based on a consistency negotiation protocol (for example, Paxos), determines the common order in which the primary and standby nodes process the data packets. The first thread then acquires the global mutex and processes the data packets in the order determined by the second thread. The second thread is, for example, a consistency negotiation thread.
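The division of work between the two threads in S310 to S340 can be sketched as follows. This is a simplified illustration, not the Qemu implementation: the consensus step is replaced by a placeholder that merely assigns sequence numbers, and all names (`global_mutex`, `buffer_queue`, `ordered`, `negotiation_worker`) are assumptions made for the example.

```python
import queue
import threading

global_mutex = threading.Lock()  # stands in for Qemu's global mutex
buffer_queue = queue.Queue()     # buffer module between the two threads
ordered = queue.Queue()          # results handed back to the main loop


def main_loop_enqueue(packet):
    """S310/S320: the main loop holds the global mutex only while buffering."""
    with global_mutex:
        buffer_queue.put(packet)


def negotiation_worker(count):
    """S330: runs without the global mutex, so the main loop stays responsive."""
    for seq in range(count):
        packet = buffer_queue.get()
        # Placeholder for the real consensus round (for example, Paxos):
        # here the agreed order is simply the arrival order.
        ordered.put((seq, packet))


def main_loop_process(count):
    """S340: the main loop processes packets in the negotiated order."""
    processed = []
    for _ in range(count):
        seq, packet = ordered.get()
        with global_mutex:
            processed.append((seq, packet))
    return processed


packets = [b"p0", b"p1", b"p2"]
worker = threading.Thread(target=negotiation_worker, args=(len(packets),))
worker.start()
for p in packets:
    main_loop_enqueue(p)
result = main_loop_process(len(packets))
worker.join()
```

The point of the sketch is that the worker never takes `global_mutex`, so the time spent waiting for the standby node does not block code that needs that lock.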
In S330, for the specific consistency negotiation procedure, refer to consistency negotiation methods in the prior art; for brevity, details are not repeated here.
In S340, the first thread may process the first data packet according to its type. For example, when the first data packet is a data packet sent by a client, the first thread may send it to the virtual machine on the primary node for processing. When the first data packet is a request packet, generated by the synchronization module of the primary node, that requests synchronization of the primary and standby nodes, the primary node may synchronize the primary and standby nodes according to that request packet.
In the above embodiment, the consistency negotiation between the primary and standby nodes is carried out by the second thread, which does not need to hold the global mutex while doing so. The primary node can therefore use its virtual machine to process other tasks while the primary and standby virtual machines are being synchronized, which improves the performance of the primary node.
Moreover, because the primary node and the standby node process the first data packet in the same order, database 1 and database 2 are guaranteed to execute accesses in the same order. This minimizes the difference in dirty pages between the primary and standby nodes and reduces the number of dirty pages that must be transmitted during primary/standby synchronization.
As an optional embodiment, S330 includes:
S331: Read the first to-be-processed information from the buffer module through the second thread.
S332: Perform consistency negotiation processing on the first to-be-processed information through the second thread to determine the order in which the first data packet is to be processed.
S333: Write the first to-be-processed information into a pipe through the second thread according to the processing order of the first data packet, where the first thread reads the first to-be-processed information from the pipe.
Because Qemu's core code is not thread-safe, the second thread, as a worker thread in Qemu, cannot directly call the program code of the primary node. The consistency negotiation scheme of this embodiment establishes a pipe for communication between the first thread and the second thread and adds the pipe to the event loop list of the Qemu main loop thread. When the second thread has a message for the Qemu main loop thread, it performs a write on the file descriptor, so that the file descriptor becomes readable on the main loop thread's side. After reading the file descriptor, the Qemu main loop thread can call the corresponding routine to perform the subsequent processing.
For example, after S332 completes, the primary node and the standby node have agreed on the processing order of the first data packet. At this point, the direct approach would be for the second thread to execute the virtual network card processing code (RTL8139_do_receiver) to perform the virtual network card's logic on the first data packet. However, the processing code of the RTL8139 virtual network card is not thread-safe. To keep the primary node's data safe, the second thread may instead write the descriptor of the first data packet into the pipe by performing a write on the pipe's file descriptor, so that the file descriptor becomes readable on the Qemu main loop thread's side. After the Qemu main loop thread reads the descriptor, it calls the virtual network card processing code to perform the subsequent processing on the first data packet. The above embodiment can therefore complete the processing that follows consistency negotiation while preserving the thread safety of the primary node.
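The pipe-based hand-off described above can be sketched in Python on a POSIX system, using `os.pipe` and the standard `selectors` module as a stand-in for the Qemu main loop's file-descriptor polling. The descriptor layout (`DESC`), the function `nic_receive`, and the packet values are hypothetical; the sketch only shows that the non-thread-safe handler runs on the main loop thread, never on the worker.

```python
import os
import selectors
import struct
import threading

read_fd, write_fd = os.pipe()
DESC = struct.Struct("<IH")  # hypothetical descriptor: packet id, packet length

handled = []                      # records (packet id, length, handling thread)
main_thread = threading.current_thread()


def nic_receive(packet_id, length):
    """Stands in for non-thread-safe device code; must run on the main loop."""
    handled.append((packet_id, length, threading.current_thread() is main_thread))


def worker():
    """Second thread: after negotiation, write descriptors in the agreed order."""
    for packet_id, length in [(1, 60), (2, 1500)]:
        os.write(write_fd, DESC.pack(packet_id, length))
    os.close(write_fd)  # end of stream, lets this demo's loop terminate


def main_loop():
    """First thread: the pipe's read end is one event source in the loop."""
    sel = selectors.DefaultSelector()
    sel.register(read_fd, selectors.EVENT_READ)
    pending = b""
    running = True
    while running:
        for key, _ in sel.select():
            chunk = os.read(key.fd, 4096)
            if not chunk:  # writer closed the pipe
                running = False
                break
            pending += chunk
            while len(pending) >= DESC.size:
                nic_receive(*DESC.unpack(pending[:DESC.size]))
                pending = pending[DESC.size:]
    sel.unregister(read_fd)
    sel.close()
    os.close(read_fd)


t = threading.Thread(target=worker)
t.start()
main_loop()
t.join()
```

Because a pipe preserves byte order, the main loop sees the descriptors in exactly the order in which the worker wrote them, which is how the negotiated order is carried across the thread boundary in this sketch.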
As an optional embodiment, S331 includes:
S3311: Read the first to-be-processed information from the buffer module through the second thread at a preset time.
In this embodiment, the preset time is, for example, the time associated with a timer event: the second thread may read the first to-be-processed information from the buffer module when the timer event fires. Because the primary node can set different timer events, this embodiment can flexibly trigger the second thread to perform consistency negotiation processing.
As an optional embodiment, before S331, method 300 further includes:
S3301: Acquire exclusive access to the buffer module through the second thread, where exclusive access to the buffer module prevents two or more threads from accessing the buffer module at the same time.
After S332, method 300 further includes:
S3321: When the number of pieces of to-be-processed information in the buffer module is 0, release the exclusive access to the buffer module acquired by the second thread.
For example, when the second thread starts working, it first acquires exclusive access to the buffer queue. This exclusive access, which may also be called a queue mutex, prevents two or more threads from accessing (writing and/or reading) the buffer queue at the same time. When the number of pieces of to-be-processed information in the buffer queue is 0, the second thread releases the queue mutex, and other threads can continue to write new to-be-processed information into the buffer queue. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information that has already undergone consistency negotiation, which improves the reliability and efficiency of the consistency negotiation processing.
As an optional embodiment, S332 includes:
S3321: Determine, through the second thread, the number of pieces of to-be-processed information in the buffer module.
S3322: When the number of pieces of to-be-processed information in the buffer module is greater than 0, write the data packets corresponding to the to-be-processed information (including the first data packet) into a consistency log through the second thread and delete the to-be-processed information from the buffer module. The consistency log is used to cache data packets, and the order of the data packets in the consistency log corresponds to the order in which they are to be processed.
S3323: Send, through the second thread, a consistency negotiation request that includes the first data packet. The consistency negotiation request asks the standby node to accept the processing order of the first data packet.
S3324: Receive a negotiation completion message through the second thread. The negotiation completion message indicates that the processing order of the first data packet has been accepted.
When a timer event or an I/O event fires, the second thread first acquires the queue mutex and then checks whether the buffer queue is empty. If the buffer queue is empty, the second thread releases the queue mutex. If the buffer queue is not empty, the second thread reads the members of the queue (data packets or data packet descriptors) one by one, inserts the data packet corresponding to each member into the consistency log of the Paxos protocol, and then deletes the member from the queue and frees the memory occupied by the original data packet. The second thread keeps reading until the queue is empty and then releases the queue mutex. After releasing the queue mutex, the second thread sends the data packets in the consistency log to the standby node in order and requests the standby node to process them in that order. When the second thread then receives a negotiation completion message from the standby node, it determines that the processing order of the data packets in the consistency log has been accepted by the standby node.
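The drain-then-negotiate sequence just described can be sketched as follows. The sketch makes several assumptions: the consistency log is a plain list whose index is the processing order, `send_to_standby` is a hypothetical stand-in for the real negotiation request and always reports acceptance, and the function name `on_timer_event` is illustrative.

```python
import threading
from collections import deque

queue_mutex = threading.Lock()  # exclusive access to the buffer queue
buffer_queue = deque([b"pkt-a", b"pkt-b", b"pkt-c"])
consistency_log = []            # position in this list is the processing order


def send_to_standby(entries):
    """Hypothetical stand-in for the negotiation request to the standby node.

    Returns True to model receipt of the negotiation completion message.
    """
    return True


def on_timer_event():
    """Second thread's handler: drain under the lock, negotiate outside it."""
    with queue_mutex:
        first_new = len(consistency_log)
        while buffer_queue:  # read until the queue is empty
            consistency_log.append(buffer_queue.popleft())
    # The queue mutex is released here, so producers can enqueue again
    # while the (potentially slow) negotiation is in flight.
    new_entries = list(enumerate(consistency_log[first_new:], start=first_new))
    if not new_entries:
        return []
    return new_entries if send_to_standby(new_entries) else []


agreed = on_timer_event()
second_call = on_timer_event()  # nothing new was queued in between
```

Holding the queue mutex for the whole drain means a producer cannot slip a new item in between items that are already being appended to the log, which matches the reliability argument made for this embodiment.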
In this embodiment, the second thread performs consistency negotiation processing after reading the to-be-processed information and then deletes that information from the buffer module. This guarantees that every item the second thread reads from the buffer module is still unprocessed, so the second thread never reads information that has already been processed, which improves the efficiency of the consistency negotiation processing.
As an optional embodiment, before S320, method 300 further includes:
S319: Acquire exclusive access to the buffer module through the first thread, where exclusive access to the buffer module prevents two or more threads from accessing the buffer module at the same time.
After S320, method 300 further includes:
S321: Release the exclusive access to the buffer module acquired by the first thread.
Before writing into the buffer module, the first thread first acquires exclusive access to the buffer module. This exclusive access, which may also be called a queue mutex, prevents two or more threads from accessing the buffer module at the same time. After the first thread finishes writing into the buffer module, it releases the queue mutex, and the second thread can acquire the queue mutex and read the to-be-processed information in the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information that has already undergone consistency negotiation, which improves the reliability and efficiency of the consistency negotiation processing.
As an optional embodiment, a primary database runs in the first virtual device, the standby node is provided with a second virtual device in which a standby database runs, and the first data packet carries an access request for the primary database sent by a client to the primary node.
S310 includes: obtaining the first to-be-processed information from the physical network card of the primary node through the first thread.
S340 includes: sending, through the first thread, the first data packet to the primary database and the standby database at the same time, so that the primary node and the standby node process the first data packet in the same order.
The first data packet sent by the client reaches the Qemu of the primary node through the primary node's physical network card. After the primary node's consistency negotiation processing, the first data packet is processed by the primary node and the standby node in the same order, which increases the proportion of identical dirty pages between the primary and standby nodes.
As an optional embodiment, method 300 further includes:
S301: Obtain, through a third thread of the emulator, the load thresholds of the primary node and the proportions of identical dirty pages between the primary node and the standby node for n synchronization operations. The load thresholds of the primary node for the n synchronization operations are $c_1, \dots, c_n$, and the corresponding proportions of identical dirty pages are $w_1, \dots, w_n$, where $c_1$ corresponds to $w_1$, ..., and $c_n$ corresponds to $w_n$, and n is a positive integer greater than or equal to 2.
S302: Determine $w_m$ through the third thread, where $w_m$ is the load threshold at the current moment after the n synchronization operations, $w_m = [(c_1 \times w_1) + \dots + (c_n \times w_n)] \div n$, and m is a positive integer.
S303: Obtain $L_m$ through the third thread, where $L_m$ is the load value of the primary node at the current moment.
S304: If $L_m \le w_m$, generate a synchronization request through the third thread. The synchronization request is used to request synchronization of the dirty pages of the primary node and the standby node.
S305: Write the synchronization request into the buffer module through the third thread.
S306: Perform consistency negotiation processing on the synchronization request through the second thread. The result of this processing is used to synchronize the order in which the primary node and the standby node process the synchronization request.
S307: Process the synchronization request through the first thread according to the result of the consistency negotiation processing performed on the synchronization request.
In the prior art, the primary node decides whether to start synchronization of the primary and standby nodes by comparing the load value at the current moment with a fixed load threshold: if the current load value is less than the fixed load threshold, synchronization is not started; if the current load value is greater than or equal to the fixed load threshold, synchronization is started. The drawback of this approach is that a fixed load threshold makes it hard to find the best moment to start primary/standby synchronization. If the fixed load threshold is set too low, for example to 0, the proportion of identical dirty pages between the primary and standby nodes is highest when the primary node's load value meets the condition (because the virtual machines of both nodes have stopped working and the dirty pages no longer change), but the virtual machines of both nodes sit idle from the moment of load detection until the moment of data synchronization, so virtual machine resources are wasted. If the fixed load threshold is set too high, the virtual machines of both nodes are still working at synchronization time and the proportion of identical dirty pages is small, so the primary and standby nodes must transmit more data (that is, the data corresponding to the differing dirty pages), and data synchronization consumes more network resources.
The following example illustrates how the technical solution provided by the present application solves this problem. Suppose the processor working time of primary virtual machine 1 from startup to the first load detection is 10 minutes, with an identical-dirty-page proportion of 80% between the virtual machines of the primary and standby nodes; the processor working time from startup to the second load detection is 20 minutes, with an identical-dirty-page proportion of 85%; and the processor working time from startup to the third load detection is 20 minutes, with an identical-dirty-page proportion of 85%. These figures show that virtual machine 1 of the primary node was already idle by the second load detection at the latest, and possibly earlier. If data synchronization starts after the second load detection, virtual machine 1 is bound to have been idle for some time, wasting virtual machine resources. The preferred time for primary/standby synchronization therefore lies after the first load detection and before the second. Within that interval, synchronization should start at a point when virtual machine 1 of the primary node has finished most or all of its work, which strikes a good balance between virtual machine resource utilization and the proportion of identical dirty pages.
According to the embodiment provided above, the load threshold is determined from the primary node's load thresholds and identical-dirty-page proportions for at least two synchronization operations. For example, the identical-dirty-page proportion of 80% obtained at the first load detection is used as a weight and multiplied by the load threshold 5 to give 4; the proportion of 85% obtained at the second load detection is used as a weight and multiplied by the load threshold 6 to give 5.1; and 4 and 5.1 are added and divided by the number of load detections, 2, to give 4.55, the weighted average of the load thresholds of the two load detections. This weighted average is the new load threshold. If the processor working time at the third load detection is 22 minutes, the primary node's load value at the third load detection is 2 (the working time 22 from the third load detection minus the working time 20 from the second). This load value is less than the new load threshold 4.55, which indicates that the primary node's virtual machine has few tasks left and will soon become idle, so the data synchronization operation is started. If the processor working time at the third load detection is 30 minutes, the primary node's load value at the third load detection is 10 (the working time 30 from the third load detection minus the working time 20 from the second). This load value is greater than the new load threshold 4.55, which indicates that the primary node's virtual machine still has many tasks left and the proportion of identical dirty pages is small, so data synchronization is not performed at the current moment. After the fourth load detection, whether to perform data synchronization is decided by comparing the load value determined at the fourth load detection with the new load threshold 4.55.
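The arithmetic of this worked example can be checked with a short Python sketch of S302 and S304. The function names `dynamic_threshold` and `should_sync` are assumptions made for illustration; the formula and the comparison are the ones stated in the text.

```python
def dynamic_threshold(thresholds, same_page_ratios):
    """S302: w_m = [(c_1 * w_1) + ... + (c_n * w_n)] / n, with n >= 2."""
    if len(thresholds) != len(same_page_ratios):
        raise ValueError("each load threshold needs a matching dirty-page ratio")
    n = len(thresholds)
    if n < 2:
        raise ValueError("at least two synchronization operations are required")
    return sum(c * w for c, w in zip(thresholds, same_page_ratios)) / n


def should_sync(load_value, threshold):
    """S304: request synchronization when L_m <= w_m."""
    return load_value <= threshold


# Figures from the example: thresholds 5 and 6, ratios 80% and 85%.
w_m = dynamic_threshold([5, 6], [0.80, 0.85])

# Load value = working time at this detection minus that at the previous one.
light_load = 22 - 20
heavy_load = 30 - 20
sync_when_light = should_sync(light_load, w_m)
sync_when_heavy = should_sync(heavy_load, w_m)
```

With these inputs the threshold evaluates to approximately 4.55, the load value 2 triggers synchronization, and the load value 10 does not.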
In addition, because the new load threshold in this embodiment is a weighted average determined from the results of multiple load measurements, it gradually converges to a preferable load threshold as the number of load detections increases. The third thread is, for example, a worker thread of the primary/standby synchronization module in the primary node, that is, the thread responsible for synchronizing the primary and standby virtual machines.
In summary, in the data synchronization processing method provided by this embodiment, the load threshold is a dynamic, preferable threshold, so that when data synchronization is performed, virtual machine resource utilization and the proportion of identical dirty pages between the primary and standby nodes reach a good balance.
As an optional embodiment, before S301, method 300 further includes:
S3001: Obtain $\mathrm{SUM}_k$ through the third thread, where $\mathrm{SUM}_k$ is the sum of the load values obtained from the first through the k-th load measurements of the primary node, and k is a positive integer.
S3002: When $k \ge T_{\mathrm{count}}$, determine $c_0$ through the third thread, where $T_{\mathrm{count}}$ is the threshold on the number of load measurements, $c_0$ is the load threshold for the primary node's first synchronization operation, and $c_0 = \mathrm{SUM}_k \div k$. Or,
S3003: When $k < T_{\mathrm{count}}$, obtain $L_{k+1}$ through the third thread, where $L_{k+1}$ is the load value of the primary node obtained from the (k+1)-th load measurement and $T_{\mathrm{count}}$ is the threshold on the number of load measurements; S3004: obtain $\mathrm{SUM}_{k+1}$ through the third thread, where $\mathrm{SUM}_{k+1} = \mathrm{SUM}_k + L_{k+1}$; S3005: when $k+1 \ge T_{\mathrm{count}}$, determine $c_0$ through the third thread, where $c_0$ is the load threshold for the primary node's first synchronization operation and $c_0 = \mathrm{SUM}_{k+1} \div (k+1)$.
例如,主节点的虚拟机启动后,负载的测量次数(COUNT)等于0,第一次负载测量后,得到第一负载值L 1,则SUM 1等于L 1,第二次负载测量后,得到第二负载值L 2,则SUM 2=SUM 1+L 2,即,SUM 2与SUM 1正相关,且SUM 2与L 2正相关。若测量次数阈值T count为2,则初始负载阈值c 0等于SUM 2除以2,即,初始负载阈值与SUM 2正相关,且初始负载阈值与测量次数负相关;若测量次数阈值T count为3,由于当前仅进行了两次负载测量,因此,还需要再进行一次负载测量,即,确定第三负载值L 3,随后计算SUM 3,SUM 3=SUM 2+L 3,即,SUM3与SUM 2正相关,且SUM 3与L 3正相关,初始负载阈值c 0等于SUM 3除以3,即,初始负载阈值与SUM 3正相关,且初始负载阈值与测量次数负相关。 For example, after the virtual machine of the primary node is started, the number of measurements (COUNT) of the load is equal to 0. After the first load measurement, the first load value L 1 is obtained , then SUM 1 is equal to L 1 , and after the second load measurement, a second load value L 2, the SUM 2 = SUM 1 + L 2 , i.e., SUM 1 and SUM 2 positive correlation, and SUM 2 L 2 with a positive correlation. If the measurement number threshold T count is 2, the initial load threshold c 0 is equal to SUM 2 divided by 2, that is, the initial load threshold is positively correlated with SUM 2 , and the initial load threshold is negatively correlated with the number of measurements; if the measurement number threshold T count is 3. Since only two load measurements have been performed so far, it is necessary to perform another load measurement, that is, to determine the third load value L 3 , and then calculate SUM 3 , SUM 3 =SUM 2 +L 3 , ie, SUM3 and SUM 2 is positively correlated, and SUM 3 is positively correlated with L 3 , the initial load threshold c 0 is equal to SUM 3 divided by 3, ie, the initial load threshold is positively correlated with SUM 3 and the initial load threshold is inversely correlated with the number of measurements.
S3001中,当k=1时,SUM 1等于第一次负载测量得到的负载值。 In S3001, when k=1, SUM 1 is equal to the load value obtained by the first load measurement.
上述实施例可以确定一个初始负载阈值,从而可以确定主节点首次进行数据同步的时机。The above embodiment can determine an initial load threshold so that the timing at which the primary node synchronizes data for the first time can be determined.
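Steps S3001 to S3005 amount to averaging the first T_count load measurements. The following Python sketch illustrates that reading. The function name `initial_threshold` and the `measure_load` callback are hypothetical and stand in for whatever load probe the third thread uses; they are not taken from the application.

```python
def initial_threshold(measure_load, t_count):
    """Return c_0 as the mean of the first t_count load measurements.

    measure_load is a hypothetical zero-argument callable returning one
    load value L_k; t_count plays the role of T_count.
    """
    if t_count < 1:
        raise ValueError("t_count must be a positive integer")
    total = 0.0  # SUM_k, the running sum of load values
    k = 0        # number of measurements taken so far
    while k < t_count:
        total += measure_load()  # SUM_{k+1} = SUM_k + L_{k+1}
        k += 1
    return total / k             # c_0 = SUM_k / k once k >= T_count
```

With measurements 4, 6 and 8, a count threshold of 2 yields 5 and a count threshold of 3 yields 6, matching the worked example in the text.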
As an optional embodiment, the load value of the primary node includes a processor load value and a memory load value, and the load threshold of the primary node includes a processor load threshold and a memory load threshold.
When both the processor and the memory of the primary node are idle, the primary node can be determined to be idle.
In this embodiment, the processor load value may be compared with the processor load threshold first and the memory load value with the memory load threshold second, or the two comparisons may be made in the opposite order, so that the timing of data synchronization between the primary and standby nodes can be determined flexibly.
To further explain how the primary and standby nodes synchronize data, the data synchronization method provided by the present application is described in detail below with reference to FIG. 4 to FIG. 6.
As shown in FIG. 4, the primary virtual machine (VM) is the VM running on the primary node, and the standby VM is the VM running on the standby node. Primary/standby VM synchronization is the data synchronization between the primary and standby nodes. In FIG. 4:
T0-T1: The primary VM and the standby VM run and record their dirty page lists.
T1-T2: The primary VM and the standby VM stop running, and each computes hash values of its dirty pages.
T2-T3: The primary VM compares its dirty-page hash values against those of the standby VM.
T3-T4: The primary VM transfers the differing dirty pages to the standby VM.
After the transfer completes, the primary VM releases the buffered network output (the differing dirty page data) and resumes running, and the standby VM resumes running.
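The T1 to T4 phases can be illustrated with a small Python sketch. It is only a model under stated assumptions: pages are represented as a dictionary from page number to bytes, SHA-256 is chosen arbitrarily because the text does not name a hash function, and the helper names `hash_pages` and `pages_to_transfer` are hypothetical.

```python
import hashlib


def hash_pages(pages):
    """T1-T2: hash every dirty page (page number -> page contents)."""
    return {number: hashlib.sha256(data).digest()
            for number, data in pages.items()}


def pages_to_transfer(primary_pages, standby_hashes):
    """T2-T4: select the primary's dirty pages whose hash differs from
    the standby's hash, or which the standby did not report at all."""
    primary_hashes = hash_pages(primary_pages)
    return {number: primary_pages[number]
            for number, digest in primary_hashes.items()
            if standby_hashes.get(number) != digest}
```

Only the pages returned by `pages_to_transfer` would need to cross the network; pages with matching hashes are skipped, which is why a high proportion of identical dirty pages makes a synchronization cheaper.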
In the above flow, T1 is the moment at which primary/standby VM synchronization is performed. As an optional embodiment, FIG. 5 shows a method flow for triggering primary/standby VM synchronization.
The method 500 includes:
S510: Record the proportion of identical dirty pages and the load thresholds at each primary/standby VM synchronization.
The synchronization module of the primary and standby VMs records, at each synchronization, the proportion of identical dirty memory pages between the primary VM and the standby VM, together with the CPU load threshold and the disk input/output (I/O) load threshold at which that synchronization was triggered.
S520: Using the proportion of identical dirty pages as the weight, compute a weighted average of the most recent thresholds to obtain the load threshold for the next primary/standby VM synchronization.
Each threshold is assigned a weight according to its proportion of identical dirty pages. The most recent n load thresholds are each multiplied by their weights and summed, and the sum is divided by n to obtain the load threshold that triggers the next primary/standby VM synchronization.
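As a minimal sketch of S520, the following Python function computes the next threshold from the last n recorded (weight, threshold) pairs exactly as the paragraph describes. The name `next_threshold` and the list-of-pairs representation are assumptions made for illustration.

```python
def next_threshold(history):
    """Return the next load threshold from the last n synchronizations.

    history is a list of (w, c) pairs, where w is the proportion of
    identical dirty pages and c is the threshold used at that
    synchronization. The weighted sum is divided by n, as in the text.
    """
    if not history:
        raise ValueError("at least one recorded synchronization is required")
    n = len(history)
    return sum(w * c for w, c in history) / n
```

For example, `next_threshold([(0.5, 10.0), (1.0, 20.0)])` gives (5 + 20) / 2 = 12.5. Note that the sum is divided by n rather than by the sum of the weights, following the text as written, so thresholds recorded with a low proportion of identical dirty pages pull the result down.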
The CPU is taken as an example below. After the j-th primary/standby VM synchronization ends, the primary/standby synchronization module of the primary VM assigns the CPU threshold c_j used for that synchronization a weight w_j equal to the proportion of identical dirty pages at the j-th synchronization. Each of the most recent n CPU thresholds is multiplied by its weight, the products are summed, and the sum is divided by n to obtain the load value that the CPU must reach to start the (j+1)-th primary/standby VM synchronization, that is,
Figure PCTCN2018082225-appb-000001
The disk I/O load threshold is adjusted in the same way.
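The equation image referenced above is not reproduced in this text. Based only on the prose description (multiply each of the most recent n thresholds by its weight, sum, and divide by n), the formula can plausibly be written as follows; the summation bounds are an assumption about what "the most recent n" means.

```latex
c_{j+1} = \frac{1}{n} \sum_{i=j-n+1}^{j} w_i \, c_i
```

Here c_i is the CPU load threshold used at the i-th synchronization and w_i is the proportion of identical dirty pages recorded at that synchronization.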
The initial threshold is calculated as shown in FIG. 6:
After the primary/standby synchronization module starts, the initial accumulated load SUM_0 = 0, the initial load value CPU_Tick_A0 = 0, and the count COUNT = 0, where COUNT is the number of load measurements performed by the primary VM.
The CPU usage time at time A1, CPU_Tick_A1, is then obtained, giving SUM_1 = SUM_0 + (CPU_Tick_A1 - CPU_Tick_A0). After waiting for Δt, the CPU usage time at time A2, CPU_Tick_A2, is obtained.
The accumulated load at time A2 is calculated according to SUM_{k+1} = SUM_k + L_{k+1}, that is, SUM_2 = SUM_1 + (CPU_Tick_A2 - CPU_Tick_A1). COUNT is then compared with 3, which is the load measurement count threshold in this example. If COUNT is greater than or equal to 3, the initial load threshold is calculated as c_0 = SUM_{k+1} / COUNT; otherwise, load measurement continues until the number of load measurements is greater than or equal to the load measurement count threshold.
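The FIG. 6 procedure can be sketched in Python as a loop over cumulative CPU tick readings. The callbacks `read_cpu_ticks` and `wait` are hypothetical placeholders for the clock function and the Δt delay, and the sketch assumes that COUNT is incremented once per measurement, which the text does not state explicitly.

```python
def initial_threshold_from_ticks(read_cpu_ticks, wait, t_count=3):
    """Accumulate tick deltas until t_count measurements are taken.

    read_cpu_ticks returns the cumulative CPU time used by the VM;
    wait blocks for the sampling interval (delta t).
    """
    total = 0     # SUM_0 = 0
    previous = 0  # CPU_Tick_A0 = 0
    count = 0     # COUNT = 0
    while True:
        current = read_cpu_ticks()
        total += current - previous  # SUM_{k+1} = SUM_k + L_{k+1}
        previous = current
        count += 1
        if count >= t_count:
            return total / count     # c_0 = SUM_{k+1} / COUNT
        wait()
```

Because CPU_Tick_A0 starts at 0, the first delta is the whole cumulative reading at A1, so the sum telescopes to the last reading. With readings 10, 25 and 45 the result is 45 / 3 = 15.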
S530: Obtain the current load value and compare it with the set load threshold to determine whether to synchronize.
The primary/standby synchronization module obtains the workload of the VM, compares it with the threshold, and starts synchronization as follows:
1. The thread on the primary node that is responsible for primary/standby VM synchronization (the synchronization thread) calls a clock function to obtain CPU_Tick_1, the CPU time consumed by the VM from startup to a first moment.
2. The synchronization thread sleeps for Δt_1, a value preset by the primary node. So that an idle VM is detected quickly while avoiding the error caused by too short a monitoring interval, Δt_1 may be set to, for example, 100 microseconds.
3. The synchronization thread calls the clock function again to obtain CPU_Tick_2, the CPU time consumed by the VM from startup to a second moment.
4. If CPU_Tick_2 - CPU_Tick_1 < c, the CPU is idle and the flow proceeds to step 5. Otherwise, the synchronization thread sleeps for Δt_1 and calls the clock function again to obtain the CPU time consumed by the VM from startup to the current moment, repeating until the current CPU time minus the previous CPU time is less than the CPU load threshold. Here c is the CPU load threshold that the CPU must reach to trigger primary/standby VM synchronization.
5. The synchronization thread obtains disk_time_1, the disk I/O time consumed by the VM from startup to the present, through the Linux Netlink interface.
6. The synchronization thread sleeps for Δt_2, a value preset by the primary node. Δt_2 is chosen according to the performance of the physical disk; for example, if one disk I/O operation takes 5 milliseconds, Δt_2 may be set to 5 milliseconds.
7. The synchronization thread again obtains the disk I/O time of the VM process, disk_time_2, through the Linux Netlink interface.
8. If disk_time_2 - disk_time_1 < d, the disk I/O is idle and primary/standby synchronization starts. Otherwise, the synchronization thread sleeps for Δt_2 and again obtains, through the Linux Netlink interface, the disk I/O time consumed by the VM from startup to the current moment, repeating until the current disk I/O time minus the previous disk I/O time (that is, the current disk I/O load value) is less than the disk I/O load threshold. Here d is the disk load threshold that the disk must reach to trigger primary/standby VM synchronization.
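Steps 1 to 8 apply the same sample, sleep, sample, compare loop twice, first to CPU time and then to disk I/O time. The Python sketch below captures that structure under stated assumptions: the counter readers are hypothetical callables (the text obtains them from a clock function and from the Linux Netlink interface, neither of which is modeled here), and the function names are illustrative.

```python
import time


def wait_until_idle(read_counter, threshold, interval, sleep=time.sleep):
    """Block until a cumulative counter grows by less than threshold
    during one sampling interval; return that final delta."""
    previous = read_counter()
    while True:
        sleep(interval)
        current = read_counter()
        delta = current - previous  # load during the last interval
        if delta < threshold:
            return delta
        previous = current          # compare against the latest sample next


def wait_for_sync_point(read_cpu, c, dt1, read_disk, d, dt2, sleep=time.sleep):
    """Steps 1-4 (CPU) followed by steps 5-8 (disk I/O)."""
    cpu_delta = wait_until_idle(read_cpu, c, dt1, sleep)
    disk_delta = wait_until_idle(read_disk, d, dt2, sleep)
    return cpu_delta, disk_delta
```

Passing `sleep` as a parameter is a design choice for testability only; a caller following the text would use intervals such as 100 microseconds for the CPU and 5 milliseconds for the disk.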
The above flow first compares the CPU load with the CPU load threshold and then compares the disk I/O load with the disk I/O load threshold. As an optional example, the disk I/O load may be checked first and the CPU load second. In addition, if other parameters affect the proportion of identical dirty pages on the primary and standby VMs, the same method can be used with those parameters to decide whether to synchronize the primary and standby VMs.
When synchronization is initiated, the primary/standby synchronization module generates a special packet descriptor that contains a pointer to a null address and also indicates the length of the packet (zero) and the type of the packet. The synchronization module acquires the mutex of the buffer queue, inserts the packet descriptor into the buffer queue, and then releases the queue mutex. During primary/standby synchronization, the primary VM compares the dirty memory pages of the primary and standby VMs and transfers the inconsistent dirty memory pages to the standby VM.
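The special descriptor and the mutex-protected queue can be modeled in a few lines of Python. This is a sketch, not the Qemu data structure: the class names, the string type tags, and the use of `None` to stand for the pointer to a null address are all assumptions for illustration.

```python
import threading
from collections import deque
from dataclasses import dataclass
from typing import Optional

CLIENT_REQUEST = "client_request"  # hypothetical type tags
SYNC_REQUEST = "sync_request"


@dataclass
class PacketDescriptor:
    data: Optional[bytes]  # None models the pointer to a null address
    length: int
    kind: str


class BufferQueue:
    """A queue guarded by a mutex, shared between producer threads and
    the consistency negotiation thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()

    def put(self, descriptor):
        with self._lock:  # acquire the mutex, insert, release
            self._items.append(descriptor)

    def drain(self):
        with self._lock:  # remove and return everything queued
            items = list(self._items)
            self._items.clear()
        return items


def make_sync_request():
    """Build the zero-length descriptor that marks a synchronization."""
    return PacketDescriptor(data=None, length=0, kind=SYNC_REQUEST)
```

Queuing the synchronization request alongside client packets is what lets it be ordered relative to client requests by the same negotiation step.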
It should be understood that the embodiments described above are only some implementations of the present application, and the embodiments of the data synchronization processing method provided by the present application are not limited to them. The data synchronization processing method is further described below on the basis of the common features described above.
FIG. 7 shows another flowchart of the data synchronization processing method provided by the present application.
As shown in FIG. 7, after a packet from a client enters the primary node through the host's physical network card, the character device (/dev/tapX) of the primary node's terminal access point (TAP) device becomes readable. When the Qemu main loop thread finds that the TAP character device is readable, it tries to acquire the global mutex and reads the client packet from the character device. The Qemu main loop thread then generates a descriptor for the packet. The descriptor includes a pointer to the packet and information describing the length and the type of the packet, where the type is client request.
The primary node acquires the mutex of the buffer queue, places the packet descriptor in the buffer queue, and then releases the queue mutex. The thread of the middle-layer module that is responsible for consistency negotiation acquires the mutex of the buffer queue and checks whether the queue is empty. If the queue is not empty, the thread reads the members of the queue (that is, the descriptors) in order, appends the packet described by each member to the consistency log of the Paxos protocol, removes the member from the queue, and frees the memory occupied by the original packet. The thread keeps reading until the queue is empty and then releases the queue mutex. After releasing the queue mutex, the thread checks whether the Paxos consistency log contains members that are waiting to be processed (not yet negotiated) and, if so, negotiates them with the other nodes according to the Paxos algorithm.
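The drain step performed by the consistency negotiation thread can be sketched as follows. The sketch covers only moving queue members into a log and finding entries still awaiting negotiation; it does not implement Paxos. The dictionary-based log entries and the function names are assumptions.

```python
import threading
from collections import deque


def drain_into_log(queue, queue_lock, log):
    """Move every queued packet into the consistency log while holding
    the queue mutex, preserving arrival order."""
    with queue_lock:
        while queue:
            packet = queue.popleft()  # read and remove the member
            log.append({"packet": packet, "negotiated": False})


def pending_entries(log):
    """Return the log entries that have not been negotiated yet."""
    return [entry for entry in log if not entry["negotiated"]]
```

Holding the mutex only while draining, and negotiating after it is released, keeps producers from being blocked for the duration of a network round trip.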
The primary/standby synchronization module determines when to synchronize the primary and standby VMs according to the methods shown in FIG. 5 and FIG. 6. When it decides to trigger synchronization, it generates a primary/standby synchronization request, acquires the mutex of the buffer queue, inserts the request into the buffer queue, and then releases the queue mutex. Both primary/standby synchronization requests and data from clients must go through consistency negotiation before they can be processed.
After the thread responsible for consistency negotiation determines, by the Paxos algorithm, that negotiation of a packet is complete, it writes the descriptor of the packet to a pipe so that the Qemu main loop thread can read the descriptor from the pipe. After reading the descriptor from the pipe, the Qemu main loop thread processes the packet according to the packet type indicated by the descriptor. For example, when the packet is from a client, the packet is sent through the virtual network card to the VM for processing.
FIG. 8 is a schematic diagram of a consistency negotiation method provided by the present application.
Because the Paxos algorithm requires at least three nodes, the distributed system shown in FIG. 8 includes an observer node in addition to the primary node and the standby node so that the requirements of the Paxos algorithm are met. The observer node may also be replaced with a standby node. In addition, a distributed system to which the present application applies may include more standby nodes.
The VMs on the primary node and the standby node are in hot standby and run the same distributed database program in parallel. The VM on the observer node is on standby. Qemu on each of the three nodes is deployed with a consistency negotiation module, which negotiates client network requests and primary/standby synchronization requests according to the Paxos algorithm. The observer node participates only in Paxos negotiation and does not participate in primary/standby synchronization.
On the standby node, the thread of the middle-layer software module that is responsible for consistency negotiation is driven by network I/O events triggered by Paxos message passing. When the thread receives a negotiation message sent by another node, it processes the message according to the Paxos algorithm. After the thread determines that consistency negotiation of a packet is complete, it sends the packet to the VM if the packet is a client request, or notifies the primary/standby synchronization module to initiate synchronization if the packet is a primary/standby synchronization request.
On the observer node, the thread of the middle-layer software module that is responsible for consistency negotiation is likewise driven by network I/O events triggered by Paxos message passing. When the thread receives a negotiation message sent by another node, it processes the message according to the Paxos algorithm. Because the VM on the observer node is on standby, after the thread determines that consistency negotiation of a packet is complete, the packet is discarded regardless of whether it is a client request or a primary/standby synchronization request.
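What each node does with a packet once negotiation completes can be summarized in a short Python sketch. The role and type strings, the callbacks `deliver_to_vm` and `start_sync`, and the returned labels are all hypothetical; on the primary node the delivery actually passes through a pipe to the Qemu main loop, which is abstracted away here.

```python
def on_negotiated(role, kind, packet, deliver_to_vm, start_sync):
    """Handle a packet whose consistency negotiation has completed.

    role is "primary", "standby" or "observer"; kind is
    "client_request" or "sync_request".
    """
    if role == "observer":
        return "discarded"      # the observer VM is on standby
    if kind == "client_request":
        deliver_to_vm(packet)   # hand the packet to the VM
        return "delivered"
    if kind == "sync_request":
        start_sync()            # notify the synchronization module
        return "sync_started"
    raise ValueError(f"unknown packet type: {kind!r}")
```

Because every node applies this step to the same negotiated sequence, the primary and standby VMs see client requests and synchronization points in the same order.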
FIG. 9 is yet another flowchart of the data synchronization processing method provided by the present application.
S1: The primary node reads a client packet.
When a client packet arrives at the physical network card of the primary node, the primary node (that is, the host operating system) calls the driver of the physical network card, which uses the software bridge in the Linux kernel to forward the data. At the software bridge layer, the primary node determines which device the packet is destined for and calls the bridge's send function to send the packet to the corresponding port. If the packet is destined for the VM, it is forwarded through the TAP device. A TAP device is equivalent to an Ethernet device and operates on layer 2 packets, that is, Ethernet frames. The character device (/dev/tapX) of the TAP device forwards packets between kernel space and user space.
The Qemu main loop thread loops continuously, using the select system call to determine which file descriptors have changed state, including the TAP device file descriptor and the pipe file descriptor. When the Qemu main loop thread finds that the TAP character device is readable, it tries to acquire the global mutex and reads the client packet from the character device. The Qemu main loop then generates a descriptor for the packet. The descriptor contains a pointer to the packet and information indicating the length and type of the packet, where the type here is client packet.
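One iteration of a select-based loop of this kind can be sketched in Python with the standard `select` module. This is an illustration of the polling pattern only, not Qemu's main loop; the `poll_once` helper and the handler dictionary are assumptions, and an ordinary pipe stands in for the TAP and pipe descriptors (select on pipes works on Unix-like systems, not on Windows).

```python
import os
import select


def poll_once(handlers, timeout=0.0):
    """Run one loop iteration: find readable descriptors and call the
    handler registered for each. handlers maps fd -> callable(fd)."""
    ready, _, _ = select.select(list(handlers), [], [], timeout)
    for fd in ready:
        handlers[fd](fd)
    return ready


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    received = []
    handlers = {read_fd: lambda fd: received.append(os.read(fd, 2048))}
    poll_once(handlers)            # nothing written yet, so nothing is ready
    os.write(write_fd, b"packet")
    poll_once(handlers)            # the read end is now readable
    print(received)
```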
S2: The primary node initiates primary/standby synchronization and generates a synchronization request packet.
An automatic threshold adjustment algorithm (as shown in S301 to S304) is deployed in the primary/standby synchronization module of Qemu on the primary node. The module monitors the CPU load and disk I/O load of the VM and decides whether to start synchronization by comparing the VM load with the load thresholds.
The flow by which the primary node initiates primary/standby synchronization is shown in FIG. 4 to FIG. 6.
When the primary node initiates synchronization, the primary/standby synchronization module generates a special packet descriptor. The descriptor contains a pointer to a null address and information indicating the length of the packet (zero) and the type of the packet, where the type here is primary/standby synchronization request. The synchronization module acquires the mutex of the buffer queue, places the packet descriptor in the buffer queue, and then releases the queue mutex. During synchronization, the primary VM compares the dirty memory pages of the primary and standby VMs and transfers only the inconsistent dirty memory pages to the standby VM.
S3: The primary node inserts the packet descriptor into the buffer queue and performs consistency negotiation on the packet.
The flow for inserting the packet descriptor into the buffer queue and performing consistency negotiation on the packet is shown in FIG. 10.
FIG. 10 has two parts. The first part is the processing flow of the Qemu main loop thread, which consists of three steps: acquiring the mutex of the buffer queue, placing the packet descriptor in the buffer queue, and releasing the queue mutex.
The second part is the processing flow of the consistency negotiation thread. The middle-layer thread responsible for consistency negotiation is event-driven (by timer events or network I/O events). For example, when a timer event fires, the consistency negotiation thread first acquires the mutex of the buffer queue and then checks whether the queue is empty. If the queue is not empty, the thread reads the members of the queue in order, inserts the packet described by each member into the consistency log of the Paxos protocol, removes the member from the queue, and frees the memory occupied by the original packet. The thread keeps reading until the queue is empty and then releases the queue mutex. After releasing the queue mutex, the thread checks whether the Paxos consistency log contains members that are waiting to be processed (not yet negotiated) and, if so, negotiates those members with the other nodes according to the Paxos algorithm.
S4: After determining that negotiation has been reached, the primary node determines the packet type.
The consistency negotiation thread listens for network I/O events, which are triggered by received Paxos messages. When the thread receives a negotiation message sent by another node, it processes the message according to the Paxos algorithm. If the thread determines by the Paxos algorithm that consistency negotiation of a packet is complete, it determines the type of the packet from the information contained in the packet. When the consistency negotiation thread negotiates an original packet (the packet as it was before insertion into the buffer queue), it encapsulates the original packet. The encapsulated packet contains, in addition to the original packet, other information, for example information indicating the type of the original packet. The consistency negotiation thread sends the encapsulated packet to the standby node.
S5: The client packet is forwarded to the Qemu main loop, which performs the logical operations of the virtual network card (for example, RTL8139) on it.
If the negotiated packet is a client packet, the consistency negotiation thread first writes the length of the packet to the pipe connected to the Qemu main loop and then writes the packet contents.
When the Qemu main loop thread finds that the file descriptor of the pipe has become readable, it acquires the global mutex and first reads from the pipe a value the size of an integer type; this value is the length of the packet sent through the pipe. Using that integer, the Qemu main loop thread then reads that many bytes from the pipe, which is the packet.
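The length-then-contents exchange over the pipe is a length-prefixed framing scheme, which the following Python sketch illustrates. The 4-byte native-order integer is an assumption standing in for "a value the size of an integer type", and the function names are hypothetical.

```python
import os
import struct

_LENGTH = struct.Struct("=i")  # assumed 4-byte integer, native byte order


def _write_all(fd, data):
    """Write every byte, retrying after partial writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def _read_exact(fd, size):
    """Read exactly size bytes, looping over short reads."""
    buffer = b""
    while len(buffer) < size:
        chunk = os.read(fd, size - len(buffer))
        if not chunk:
            raise EOFError("pipe closed before a full packet arrived")
        buffer += chunk
    return buffer


def write_packet(fd, payload):
    """Writer side: the packet length first, then the contents."""
    _write_all(fd, _LENGTH.pack(len(payload)))
    _write_all(fd, payload)


def read_packet(fd):
    """Reader side: read the length integer, then that many bytes."""
    (length,) = _LENGTH.unpack(_read_exact(fd, _LENGTH.size))
    return _read_exact(fd, length)
```

Sending the length first lets the reader recover packet boundaries from a byte stream, since a pipe does not preserve message boundaries by itself.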
The Qemu main loop thread then calls the RTL8139_do_receiver function, which performs the logical operations equivalent to those of a hardware RTL8139 network card. The kernel-based virtual machine (KVM) operates the virtual RTL8139 by emulating I/O instructions, copying the packet into the guest address space at the corresponding I/O address. After the operation completes, the Qemu main loop thread releases the global mutex.
S6: The application in the VM processes the client packet.
If the client packet is a data query request, the database program in the VM executes the query after receiving the packet and returns the result.
S7: Start primary/standby VM synchronization (the packet is a synchronization request).
If the negotiated packet is a primary/standby synchronization request, the consistency negotiation thread notifies the primary/standby synchronization module to initiate synchronization. A VM synchronization-preparation data frame is generated and placed in the buffer queue of the primary node to be sent.
The system architecture and service scenarios described in the above embodiments are intended to explain the technical solutions of the present application more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided by the present application apply equally to similar technical problems.
It is worth noting that, in the embodiments of the present invention, the primary/standby synchronization module may be implemented by the third thread of Qemu and the consistency negotiation layer module may be implemented by the second thread of Qemu, where the second thread and the third thread are both worker threads of Qemu.
The data synchronization processing method provided by the present application has been described in detail above with reference to FIG. 1 to FIG. 10. It can be understood that, to implement the above functions, the primary node includes the hardware structures and/or software modules corresponding to each function. A person skilled in the art will readily appreciate that, given the units and algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The present application may divide the primary node into functional units according to the above method examples. For example, each function may be assigned its own functional unit, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in hardware or as a software functional unit. It should be noted that the division of units in the present application is schematic and is only a division by logical function; other divisions are possible in an actual implementation.
FIG. 11 is a schematic structural diagram of a possible data synchronization processing apparatus provided by the present application. The data synchronization processing apparatus 1100 may be a software module or a hardware module included in the master node, and includes a first thread control unit 1101 and a second thread control unit 1102. The first thread control unit 1101 and the second thread control unit 1102 are configured to control and manage the actions of the data synchronization processing apparatus 1100; for example, they are configured to support the data synchronization processing apparatus 1100 in performing the steps of FIG. 3 and/or other processes of the techniques described herein.
Several embodiments of the data synchronization processing apparatus 1100 are described below.
The first thread control unit 1101 is configured to: acquire first to-be-processed information, where the first to-be-processed information is a first data packet or first indication information, the first indication information indicates the first data packet, and the first thread control unit 1101 is configured to execute non-thread-safe code; and write the first to-be-processed information into a buffer module;
The second thread control unit 1102 is configured to perform consistency negotiation processing on the first to-be-processed information, where the consistency negotiation processing is used to synchronize the order in which the master node and the standby node process the first data packet;
The first thread control unit 1101 is further configured to process the first data packet according to the result of the consistency negotiation processing performed by the second thread control unit 1102.
The data synchronization processing apparatus 1100 may execute code through the first thread control unit 1101 and the second thread control unit 1102 to complete the corresponding tasks. Because the first thread control unit 1101 executes non-thread-safe code, it must hold a mutex while operating; for example, it must hold the global mutex before acquiring the first to-be-processed information. The present application does not limit the manner in which the first thread control unit 1101 acquires the first to-be-processed information. After acquiring the first to-be-processed information, the first thread control unit 1101 writes it into the buffer module. The buffer module may be a buffer queue, a heap or a stack used to buffer the first to-be-processed information, or another data structure used for that purpose, which is not limited in the present application. Once the first thread control unit 1101 has written the first to-be-processed information into the buffer module, it can release the global mutex, and other threads can acquire the global mutex and schedule the virtual machine to perform other tasks. The second thread control unit 1102 reads at least one piece of to-be-processed information from the buffer module and determines, based on a consistency negotiation protocol, a common order in which the master node and the standby node process data packets. The first thread control unit 1101 then acquires the global mutex and processes the data packets in the order determined by the second thread control unit 1102. Because the consistency negotiation between the master node and the standby node is performed by the second thread control unit 1102, which does not need to hold the global mutex while working, a master node configured with the data synchronization processing apparatus 1100 can use the master virtual machine to process other tasks while the master and standby virtual machines are being synchronized, and therefore performs better than a master node in the prior art.
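The division of work described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `run_demo`, `global_mutex`, `queue_mutex` and `buffer_module` are hypothetical, and the "negotiation" simply keeps arrival order instead of running a real consistency negotiation protocol with a standby node.

```python
import threading
from collections import deque


def run_demo(packets):
    """Buffer packets under the global mutex, negotiate order off-lock, then process."""
    global_mutex = threading.Lock()   # stand-in for the emulator's global mutex
    queue_mutex = threading.Lock()    # guards the buffer module
    buffer_module = deque()           # buffer queue of to-be-processed information
    agreed_order = []                 # order fixed by the (simulated) negotiation
    negotiated = threading.Event()

    def second_thread():
        # Worker thread: it never takes global_mutex, so other threads
        # remain free to use the global mutex while it works.
        with queue_mutex:
            while buffer_module:
                agreed_order.append(buffer_module.popleft())
        negotiated.set()

    # First thread: hold the global mutex only while buffering the packets.
    with global_mutex:
        with queue_mutex:
            buffer_module.extend(packets)

    worker = threading.Thread(target=second_thread)
    worker.start()
    negotiated.wait()
    worker.join()

    # First thread again: re-acquire the global mutex and process in agreed order.
    with global_mutex:
        return [f"processed {p}" for p in agreed_order]
```

The point of the sketch is that the global mutex is held only for the short buffering and processing steps, not for the duration of the negotiation.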
Optionally, the second thread control unit 1102 is specifically configured to:
read the first to-be-processed information from the buffer module;
perform consistency negotiation processing on the first to-be-processed information to determine the order in which the first data packet is to be processed;
write the first to-be-processed information into a pipe according to the order in which the first data packet is to be processed, where the pipe is used by the first thread control unit 1101 to read the first to-be-processed information.
The first data packet may be a data packet obtained from a client, a data packet generated by the master node, or another data packet; the present application does not limit the specific content of the first data packet. Because some of the program code executed by the data synchronization processing apparatus 1100 is not thread-safe, the second thread control unit 1102, as a worker thread, cannot directly call that program code. The consistency negotiation scheme provided in this embodiment therefore establishes a pipe for communication between the first thread control unit 1101 and the second thread control unit 1102. The second thread control unit 1102 writes the result of the consistency negotiation into the pipe, and the first thread control unit 1101 reads the result from the pipe, so that the consistency negotiation can be completed without compromising the safety of the data synchronization processing apparatus 1100.
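The pipe-based hand-off can be sketched as follows. This is a hedged illustration, not the patent's implementation: `negotiate_and_publish`, `read_agreed_order` and `demo_pipe` are hypothetical names, and the negotiation is replaced by a placeholder that keeps the input order.

```python
import os
import threading


def negotiate_and_publish(write_fd, pending):
    # Placeholder for consistency negotiation (assumption): keep arrival order.
    agreed = list(pending)
    # Write the agreed order into the pipe; closing the write end signals EOF.
    with os.fdopen(write_fd, "w") as pipe_out:
        for item in agreed:
            pipe_out.write(item + "\n")


def read_agreed_order(read_fd):
    # The first thread reads the negotiated order from the pipe until EOF.
    with os.fdopen(read_fd, "r") as pipe_in:
        return [line.rstrip("\n") for line in pipe_in]


def demo_pipe(pending):
    read_fd, write_fd = os.pipe()
    worker = threading.Thread(target=negotiate_and_publish, args=(write_fd, pending))
    worker.start()
    order = read_agreed_order(read_fd)
    worker.join()
    return order
```

Here the worker thread never calls into the first thread's code; the only shared object is the pipe, which is the property the paragraph above relies on.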
Optionally, the second thread control unit 1102 is further specifically configured to read the first to-be-processed information from the buffer module at a preset time.
In this embodiment, the preset time is, for example, the time corresponding to a timer event. The second thread control unit 1102 may read the first to-be-processed information from the buffer module when the timer event fires, and the master node may set different timer events. This embodiment can therefore flexibly trigger the second thread control unit 1102 to perform consistency negotiation processing.
Optionally, before reading the first to-be-processed information from the buffer module, the second thread control unit 1102 is further specifically configured to: acquire exclusive permission for the buffer module, where the exclusive permission for the buffer module prohibits two or more threads from accessing the buffer module at the same time;
After performing consistency negotiation processing on the first to-be-processed information, the second thread control unit 1102 is further specifically configured to: when the number of pieces of to-be-processed information in the buffer module is 0, release the exclusive permission for the buffer module acquired by the second thread control unit 1102.
When the second thread control unit 1102 starts working, it first acquires the exclusive permission for the buffer module. This exclusive permission, which may also be called a queue mutex, prohibits two or more thread control units from accessing the buffer module at the same time. When the number of pieces of to-be-processed information in the buffer module is 0, the second thread control unit 1102 releases the queue mutex, and other threads can continue to write new to-be-processed information into the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information whose consistency negotiation has already been completed, which improves the reliability and efficiency of the consistency negotiation processing.
Optionally, the second thread control unit 1102 is further specifically configured to:
determine the number of pieces of to-be-processed information in the buffer module;
when the number of pieces of to-be-processed information is greater than 0, write the data packets corresponding to the to-be-processed information into a consistency log and delete the to-be-processed information, where the consistency log is used to cache the data packets corresponding to the to-be-processed information, the sequence of the data packets in the consistency log corresponds to the order in which those data packets are to be processed, the to-be-processed information includes the first to-be-processed information, and the data packets corresponding to the to-be-processed information include the first data packet;
send a consistency negotiation request that includes the first data packet, where the consistency negotiation request is used to request the standby node to accept the order in which the first data packet is to be processed;
receive a negotiation completion message, where the negotiation completion message indicates that the order in which the first data packet is to be processed has been accepted.
After reading the to-be-processed information and performing consistency negotiation processing, the second thread control unit 1102 deletes the to-be-processed information from the buffer module. This guarantees that the information the second thread control unit 1102 reads from the buffer module each time is new to-be-processed information and that it never reads information that has already been processed, which improves the efficiency of the consistency negotiation processing.
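The steps above (take the queue mutex, count the pending items, move them into the consistency log, send the request, wait for completion, release the mutex once the buffer is empty) can be sketched as below. The function name `negotiate_batch`, the `send_request` callable and the reply string `"negotiation-complete"` are all hypothetical; the real message format between master and standby nodes is not specified here.

```python
import threading
from collections import deque


def negotiate_batch(buffer_module, queue_mutex, consistency_log, send_request):
    """Drain the buffer into the consistency log and negotiate its order.

    Returns True if the (hypothetical) standby side accepted the order,
    False if there was nothing to negotiate or the order was not accepted.
    """
    with queue_mutex:                      # exclusive permission for the buffer
        if len(buffer_module) == 0:        # number of pending items is 0
            return False
        batch = []
        while buffer_module:
            packet = buffer_module.popleft()   # delete the item once it is read
            consistency_log.append(packet)     # log order == processing order
            batch.append(packet)
        reply = send_request(batch)        # consistency negotiation request
        # The queue mutex is released on leaving this block, with the buffer empty.
        return reply == "negotiation-complete"
```

Because items are removed as they are logged, a later call can only see information written after the previous negotiation finished.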
Optionally, before writing the first to-be-processed information into the buffer module, the first thread control unit 1101 is further configured to: acquire exclusive permission for the buffer module, where the exclusive permission for the buffer module prohibits two or more threads from accessing the buffer module at the same time;
After writing the first to-be-processed information into the buffer module, the first thread control unit 1101 is further configured to: release the exclusive permission for the buffer module acquired by the first thread control unit 1101.
Before writing to the buffer module, the first thread control unit 1101 first acquires the exclusive permission for the buffer module. This exclusive permission, which may also be called a queue mutex, prohibits two or more thread control units from accessing the buffer module at the same time. After the first thread control unit 1101 finishes writing to the buffer module, it releases the queue mutex, and the second thread control unit 1102 can acquire the queue mutex and read the to-be-processed information in the buffer module. This embodiment prevents new to-be-processed information from being inserted into a queue of to-be-processed information whose consistency negotiation has already been completed, which improves the reliability and efficiency of the consistency negotiation processing.
Optionally, a master database runs in the first virtual device, the standby node is provided with a second virtual device in which a standby database runs, and the first data packet carries an access request for the master database sent by a client to the master node;
The first thread control unit 1101 is further specifically configured to: acquire the first to-be-processed information from a physical network card of the master node; and send the first data packet to both the master database and the standby database, so that the master node and the standby node process the first data packet in the same order.
Optionally, the apparatus further includes a third thread control unit, and the third thread control unit is configured to:
acquire the load thresholds of the master node and the proportions of identical dirty pages between the master node and the standby node for n synchronization operations, where the load thresholds of the master node for the n synchronization operations are $c_1, \ldots, c_n$, the proportions of identical dirty pages between the master node and the standby node for the n synchronization operations are $w_1, \ldots, w_n$, $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and $n$ is a positive integer greater than or equal to 2;
determine $w_m$, where $w_m$ is the load threshold at the current moment after the n synchronization operations, $w_m = [(c_1 \times w_1) + \cdots + (c_n \times w_n)] \div n$, and $m$ is a positive integer;
acquire $L_m$, where $L_m$ is the load value of the master node at the current moment;
if $L_m \le w_m$, generate a synchronization request, where the synchronization request is used to request synchronization of the dirty pages of the master node and the standby node;
write the synchronization request into the buffer module;
The second thread control unit 1102 is further specifically configured to:
perform consistency negotiation processing on the synchronization request, where the result of the consistency negotiation processing on the synchronization request is used to synchronize the order in which the master node and the standby node process the synchronization request;
The first thread control unit 1101 is further specifically configured to:
process the synchronization request according to the result of the consistency negotiation processing performed on the synchronization request.
The load threshold used by the data synchronization apparatus provided in this embodiment is a dynamic, near-optimal threshold, so that during data synchronization a good balance can be reached between virtual machine resource utilization and the proportion of identical dirty pages between the master node and the standby node.
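As a worked illustration of the threshold formula, the following Python sketch computes $w_m$ from the recorded pairs $(c_i, w_i)$ and applies the test $L_m \le w_m$. The function names and the sample numbers are assumptions chosen for the example; they do not come from the application.

```python
def dynamic_load_threshold(thresholds, same_dirty_ratios):
    """Return w_m = [(c_1*w_1) + ... + (c_n*w_n)] / n for n >= 2 past syncs."""
    n = len(thresholds)
    if n != len(same_dirty_ratios):
        raise ValueError("each c_i needs a matching w_i")
    if n < 2:
        raise ValueError("n must be at least 2")
    return sum(c * w for c, w in zip(thresholds, same_dirty_ratios)) / n


def should_synchronize(current_load, thresholds, same_dirty_ratios):
    """Generate a synchronization request only when L_m <= w_m."""
    return current_load <= dynamic_load_threshold(thresholds, same_dirty_ratios)
```

With the illustrative values c = [0.5, 0.5] and w = [0.5, 1.0], the threshold is (0.25 + 0.5) / 2 = 0.375, so a current load of 0.3 would trigger a synchronization request and a load of 0.4 would not.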
Optionally, before acquiring the load thresholds of the master node and the proportions of identical dirty pages between the master node and the standby node for the n synchronization operations, the third thread control unit is further specifically configured to:
acquire $SUM_k$, where $SUM_k$ is the sum of the load values of the master node obtained from the first load measurement through the k-th load measurement, and $k$ is a positive integer.
when $k \ge T_{count}$, determine $c_0$, where $T_{count}$ is the threshold on the number of load measurements, $c_0$ is the load threshold for the first synchronization operation of the master node, and $c_0 = SUM_k \div k$. Or,
when $k < T_{count}$, acquire $L_{k+1}$, where $L_{k+1}$ is the load value of the master node obtained from the (k+1)-th load measurement and $T_{count}$ is the threshold on the number of load measurements; acquire $SUM_{k+1}$, where $SUM_{k+1} = SUM_k + L_{k+1}$; and when $k+1 \ge T_{count}$, determine $c_0$, where $c_0$ is the load threshold for the first synchronization operation of the master node and $c_0 = SUM_{k+1} \div (k+1)$.
The foregoing embodiment determines an initial load threshold, and thereby determines when the master node performs data synchronization for the first time.
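The initial threshold is simply the running mean of the load measurements once enough of them have been taken. A minimal sketch, with the hypothetical function name `initial_load_threshold` and made-up sample measurements:

```python
def initial_load_threshold(measurements, t_count):
    """Return c_0 = SUM_k / k at the first k with k >= T_count, else None."""
    if t_count < 1:
        raise ValueError("t_count must be a positive integer")
    running_sum = 0.0   # SUM_k
    k = 0               # number of load measurements taken so far
    for load in measurements:
        running_sum += load      # SUM_{k+1} = SUM_k + L_{k+1}
        k += 1
        if k >= t_count:
            return running_sum / k
    return None         # not enough measurements yet; keep measuring
```

Measurements taken after the count threshold is reached do not affect $c_0$ in this sketch, which matches the two branches described above.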
Optionally, the load value of the master node includes a processor load value and a memory load value, and the load threshold of the master node includes a processor load threshold and a memory load threshold.
In this embodiment, the processor load value may be compared with the processor load threshold first and the memory load value with the memory load threshold second, or the memory comparison may be made first and the processor comparison second, so that the timing of data synchronization between the master node and the standby node can be determined flexibly.
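One possible reading of the two comparisons is sketched below. It assumes, as an illustration only, that synchronization is allowed when both the processor load and the memory load are at or below their thresholds; the text above does not state how the two results are combined, and the name `within_thresholds` and its parameters are hypothetical.

```python
def within_thresholds(cpu_load, mem_load, cpu_threshold, mem_threshold, cpu_first=True):
    """Check both load values against their thresholds in a configurable order."""
    checks = [(cpu_load, cpu_threshold), (mem_load, mem_threshold)]
    if not cpu_first:
        checks.reverse()            # compare memory first, then processor
    # all() stops at the first failing comparison, so the order decides
    # which resource is examined first.
    return all(load <= limit for load, limit in checks)
```

Under this assumption the order of comparison changes only which check runs first, not the final decision.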
FIG. 12 is another possible schematic diagram of the master node involved in the present application.
Referring to FIG. 12, the master node 1200 includes a processor 1202, a transceiver 1203, and a memory 1201. The transceiver 1203, the processor 1202, and the memory 1201 can communicate with one another through an internal connection path to transfer control and/or data signals.
The processing unit 1102 may be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication unit 1103 may be a transceiver, a transceiver circuit, or the like. The storage unit 1101 may be a memory.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
The master node 1200 provided by the present application handles the consistency negotiation between the master node and the standby node through the second thread, which does not need to hold the global mutex while working. The master node 1200 can therefore use the virtual machine to process other tasks while the master and standby virtual machines are being synchronized, which improves the performance of the master node.
The apparatus corresponds fully to the master node in the method embodiments, and the corresponding modules perform the corresponding steps. For example, the communication module performs the sending or receiving steps of the method embodiments, and steps other than sending and receiving may be performed by the processing module or the processor. For the functions of specific modules, refer to the corresponding method embodiments; details are not repeated here.
In the embodiments of the present application, the sequence numbers of the processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the present application in any way.
It should be noted that, in the embodiments of the present invention, the functions of the virtual machine may also be implemented by a container, and both containers and virtual machines may be referred to as virtual devices.
In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
The steps of the methods or algorithms described in connection with the disclosure of the present application may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in a random access memory (RAM), a flash memory, a read-only memory (ROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in the master node. The processor and the storage medium may also exist in the master node as discrete components.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
The specific implementations described above further explain the objectives, technical solutions, and beneficial effects of the present application in detail. It should be understood that the foregoing is merely specific implementations of the present application and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present application shall fall within the scope of protection of the present application.

Claims (23)

  1. A data synchronization processing method, characterized in that the method is applied to an emulator of a master node in a computer system, the emulator is used to emulate a hardware device for a first virtual device of the master node, the computer system further comprises a standby node connected to the master node, and the method comprises:
    acquiring first to-be-processed information through a first thread of the emulator, wherein the first to-be-processed information is a first data packet or first indication information, the first indication information indicates the first data packet, and the first thread is a thread that executes non-thread-safe code;
    writing the first to-be-processed information into a buffer module through the first thread;
    performing consistency negotiation processing on the first to-be-processed information through a second thread of the emulator, wherein the consistency negotiation processing is used to synchronize the order in which the master node and the standby node process the first data packet;
    processing the first data packet through the first thread according to a result of the consistency negotiation processing.
  2. The method according to claim 1, characterized in that the performing consistency negotiation processing on the first to-be-processed information through a second thread of the emulator comprises:
    reading the first to-be-processed information from the buffer module through the second thread;
    performing consistency negotiation processing on the first to-be-processed information through the second thread to determine the order in which the first data packet is to be processed;
    writing the first to-be-processed information into a pipe through the second thread according to the order in which the first data packet is to be processed, wherein the pipe is used by the first thread to read the first to-be-processed information.
  3. The method according to claim 2, characterized in that the reading the first to-be-processed information from the buffer module through the second thread comprises:
    reading the first to-be-processed information from the buffer module through the second thread at a preset time.
  4. The method according to claim 2 or 3, characterized in that,
    before the reading the first to-be-processed information from the buffer module through the second thread, the method further comprises:
    acquiring exclusive permission for the buffer module through the second thread, wherein the exclusive permission for the buffer module prohibits two or more threads from accessing the buffer module at the same time;
    after the performing consistency negotiation processing on the first to-be-processed information through the second thread, the method further comprises:
    when the number of pieces of to-be-processed information in the buffer module is 0, releasing, through the second thread, the exclusive permission for the buffer module acquired by the second thread.
  5. The method according to any one of claims 2 to 4, wherein performing, by the second thread, the consistency negotiation processing on the first to-be-processed information comprises:
    determining, by the second thread, the quantity of to-be-processed information in the buffer module;
    when the quantity of to-be-processed information is greater than 0, writing, by the second thread, the data packets corresponding to the to-be-processed information into a consistency log and deleting the to-be-processed information, wherein the consistency log is used to cache the data packets corresponding to the to-be-processed information, the sequence of the data packets in the consistency log corresponds to the order in which those data packets are processed, the to-be-processed information comprises the first to-be-processed information, and the data packets corresponding to the to-be-processed information comprise the first data packet;
    sending, by the second thread, a consistency negotiation request comprising the first data packet, wherein the consistency negotiation request is used to request the standby node to accept the processing order of the first data packet; and
    receiving, by the second thread, a negotiation completion message, wherein the negotiation completion message indicates that the processing order of the first data packet has been accepted.
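A minimal sketch of the consistency-log step in claim 5 follows. It assumes an in-memory list as the consistency log and a stub standby node that accepts every proposed position; `StubStandby`, `on_request`, and `negotiate_pending` are hypothetical names, and the application does not prescribe this message format.

```python
from collections import deque


class StubStandby:
    """Hypothetical standby node that accepts every proposed position."""

    def __init__(self):
        self.accepted = []

    def on_request(self, index, packet):
        # Record the proposed position and answer with "negotiation complete".
        self.accepted.append((index, packet))
        return True


def negotiate_pending(pending, consistency_log, standby):
    """Move pending packets into the log and negotiate each log position."""
    agreed = []
    while len(pending) > 0:                 # quantity of pending information > 0
        packet = pending.popleft()          # delete it from the buffer
        consistency_log.append(packet)      # log position = processing order
        index = len(consistency_log) - 1
        if standby.on_request(index, packet):   # request, then completion
            agreed.append((index, packet))
    return agreed
```

Because the position in the log is the thing being agreed on, both nodes can replay the log front to back and arrive at the same processing order.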
  6. The method according to any one of claims 1 to 5, wherein
    before writing, by the first thread, the first to-be-processed information into the buffer module, the method further comprises:
    acquiring, by the first thread, an exclusive permission for the buffer module, wherein the exclusive permission for the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time; and
    after writing, by the first thread, the first to-be-processed information into the buffer module, the method further comprises:
    releasing, by the first thread, the exclusive permission for the buffer module acquired by the first thread.
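The exclusive permission in claims 4 and 6 behaves like a mutual-exclusion lock around the buffer module. The sketch below assumes a `threading.Lock` plays that role; the class `BufferModule` and its methods `write` and `drain` are hypothetical names chosen for illustration.

```python
import threading
from collections import deque


class BufferModule:
    """Buffer guarded by a lock that models the exclusive permission."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()

    def write(self, item):
        # Writer side (claim 6): acquire, write one item, release.
        with self._lock:
            self._items.append(item)

    def drain(self, handle):
        # Reader side (claim 4): acquire once, handle items until the
        # quantity of pending items is 0, then release.
        with self._lock:
            while self._items:
                handle(self._items.popleft())
```

In this sketch `handle` runs while the lock is held, so it must not call `write` on the same buffer, otherwise the non-reentrant lock would deadlock.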
  7. The method according to any one of claims 1 to 6, wherein a primary database runs in the first virtual device, the standby node is provided with a second virtual device, a standby database runs in the second virtual device, and the first data packet carries an access request for the primary database that a client sends to the primary node,
    obtaining, by the first thread of the simulator, the first to-be-processed information comprises:
    obtaining, by the first thread, the first to-be-processed information from a physical network interface card of the primary node; and
    processing, by the first thread, the first data packet according to the result of the consistency negotiation processing comprises:
    sending, by the first thread, the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node process the first data packet in the same order.
  8. The method according to any one of claims 1 to 7, wherein the method further comprises:
    obtaining, by a third thread of the simulator, load thresholds of the primary node at n synchronization operations and same-dirty-page ratios between the primary node and the standby node at the n synchronization operations, wherein the load thresholds of the primary node at the n synchronization operations are $c_1, \dots, c_n$, the same-dirty-page ratios between the primary node and the standby node at the n synchronization operations are $w_1, \dots, w_n$, $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and n is a positive integer greater than or equal to 2;
    determining, by the third thread, $w_m$, wherein $w_m$ is the load threshold at the current moment after the n synchronization operations, $w_m = [(c_1 \times w_1) + \dots + (c_n \times w_n)] \div n$, and m is a positive integer;
    obtaining, by the third thread, $L_m$, wherein $L_m$ is the load value of the primary node at the current moment;
    if $L_m \le w_m$, generating, by the third thread, a synchronization request, wherein the synchronization request is used to request synchronization of dirty pages of the primary node and the standby node;
    writing, by the third thread, the synchronization request into the buffer module;
    performing, by the second thread, consistency negotiation processing on the synchronization request, wherein a result of the consistency negotiation processing on the synchronization request is used to synchronize the order in which the primary node and the standby node process the synchronization request; and
    processing, by the first thread, the synchronization request according to the result of the consistency negotiation processing on the synchronization request.
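The threshold rule in claim 8 is plain arithmetic and can be sketched directly. The function names `next_load_threshold` and `should_request_sync` are hypothetical, and the sample numbers are made up purely to show the calculation.

```python
def next_load_threshold(c, w):
    """Compute w_m = [(c_1 * w_1) + ... + (c_n * w_n)] / n.

    c: load thresholds used at the previous n synchronization operations.
    w: same-dirty-page ratios observed at those operations.
    """
    if len(c) != len(w):
        raise ValueError("c and w must have the same length")
    n = len(c)
    if n < 2:
        raise ValueError("the claim requires n >= 2")
    return sum(ci * wi for ci, wi in zip(c, w)) / n


def should_request_sync(current_load, threshold):
    """A synchronization request is generated when L_m <= w_m."""
    return current_load <= threshold


# Illustrative values only: (0.5 * 0.5 + 0.25 * 1.0) / 2 = 0.25
w_m = next_load_threshold([0.5, 0.25], [0.5, 1.0])
```

With these sample values, a current load of 0.2 would trigger a synchronization request and a load of 0.3 would not.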
  9. The method according to claim 8, wherein before obtaining, by the third thread, the load thresholds of the primary node at the n synchronization operations and the same-dirty-page ratios between the primary node and the standby node, the method further comprises:
    obtaining, by the third thread, $\mathrm{SUM}_k$, wherein $\mathrm{SUM}_k$ is the sum of the load values obtained from the first load measurement through the k-th load measurement of the primary node, and k is a positive integer; and
    when $k \ge T_{\mathrm{count}}$, determining, by the third thread, $c_0$, wherein $T_{\mathrm{count}}$ is a threshold for the number of load measurements, $c_0$ is the load threshold for the first synchronization operation of the primary node, and $c_0 = \mathrm{SUM}_k \div k$; or
    when $k < T_{\mathrm{count}}$, obtaining, by the third thread, $L_{k+1}$, wherein $L_{k+1}$ is the load value of the primary node obtained from the (k+1)-th load measurement and $T_{\mathrm{count}}$ is a threshold for the number of load measurements; obtaining, by the third thread, $\mathrm{SUM}_{k+1}$, wherein $\mathrm{SUM}_{k+1} = \mathrm{SUM}_k + L_{k+1}$; and when $k+1 \ge T_{\mathrm{count}}$, determining, by the third thread, $c_0$, wherein $c_0$ is the load threshold for the first synchronization operation of the primary node and $c_0 = \mathrm{SUM}_{k+1} \div (k+1)$.
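Claim 9 describes one step of accumulating load measurements until the measurement count reaches the threshold. The sketch below generalizes that step into a loop, which is an assumption about how the step would be repeated; `initial_threshold` and `measure_load` are hypothetical names.

```python
def initial_threshold(measure_load, t_count):
    """Return c_0 = SUM_k / k once k measurements have been taken, k >= T_count.

    measure_load: callable returning one load measurement of the primary node.
    t_count: threshold for the number of load measurements (T_count).
    """
    if t_count < 1:
        raise ValueError("t_count must be at least 1")
    total = 0.0   # SUM_k, the running sum of load values
    k = 0         # number of measurements taken so far
    while k < t_count:
        total += measure_load()   # SUM_{k+1} = SUM_k + L_{k+1}
        k += 1
    return total / k              # c_0 = SUM_k / k
```

For example, with measurements 1.0, 2.0, 3.0 and a count threshold of 3, the first load threshold is their mean, 2.0.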
  10. The method according to claim 8 or 9, wherein the load value of the primary node comprises a processor load value, and the load threshold of the primary node comprises a processor load threshold.
  11. The method according to claim 8 or 9, wherein the load value of the primary node comprises a memory load value, and the load threshold of the primary node comprises a memory load threshold.
  12. A data synchronization processing apparatus, applied to a simulator of a primary node in a computer system, wherein the simulator is configured to simulate a hardware device for a first virtual device of the primary node, the computer system further comprises a standby node connected to the primary node, and the apparatus comprises:
    a first thread control unit, configured to obtain first to-be-processed information, wherein the first to-be-processed information is a first data packet or first indication information, the first indication information is used to indicate the first data packet, and the first thread control unit is configured to execute non-thread-safe code; and to write the first to-be-processed information into a buffer module; and
    a second thread control unit, configured to perform consistency negotiation processing on the first to-be-processed information, wherein the consistency negotiation processing is used to synchronize the order in which the primary node and the standby node process the first data packet;
    wherein the first thread control unit is further configured to process the first data packet according to a result of the consistency negotiation processing performed by the second thread control unit.
  13. The apparatus according to claim 12, wherein the second thread control unit is specifically configured to:
    read the first to-be-processed information from the buffer module;
    perform the consistency negotiation processing on the first to-be-processed information to determine a processing order of the first data packet; and
    write the first to-be-processed information into a pipe according to the processing order of the first data packet, wherein the pipe is used by the first thread control unit to read the first to-be-processed information.
  14. The apparatus according to claim 13, wherein the second thread control unit is further specifically configured to:
    read the first to-be-processed information from the buffer module at a preset time.
  15. The apparatus according to claim 13 or 14, wherein
    before reading the first to-be-processed information from the buffer module, the second thread control unit is further specifically configured to:
    acquire an exclusive permission for the buffer module, wherein the exclusive permission for the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time; and
    after performing the consistency negotiation processing on the first to-be-processed information, the second thread control unit is further specifically configured to:
    when the quantity of to-be-processed information in the buffer module is 0, release the exclusive permission for the buffer module acquired by the second thread.
  16. The apparatus according to any one of claims 13 to 15, wherein the second thread control unit is further specifically configured to:
    determine the quantity of to-be-processed information in the buffer module;
    when the quantity of to-be-processed information is greater than 0, write the data packets corresponding to the to-be-processed information into a consistency log and delete the to-be-processed information, wherein the consistency log is used to cache the data packets corresponding to the to-be-processed information, the sequence of the data packets in the consistency log corresponds to the order in which those data packets are processed, the to-be-processed information comprises the first to-be-processed information, and the data packets corresponding to the to-be-processed information comprise the first data packet;
    send a consistency negotiation request comprising the first data packet, wherein the consistency negotiation request is used to request the standby node to accept the processing order of the first data packet; and
    receive a negotiation completion message, wherein the negotiation completion message indicates that the processing order of the first data packet has been accepted.
  17. The apparatus according to any one of claims 12 to 16, wherein
    before writing the first to-be-processed information into the buffer module, the first thread control unit is further configured to:
    acquire an exclusive permission for the buffer module, wherein the exclusive permission for the buffer module is used to prohibit two or more threads from accessing the buffer module at the same time; and
    after writing the first to-be-processed information into the buffer module, the first thread control unit is further configured to:
    release the exclusive permission for the buffer module acquired by the first thread control unit.
  18. The apparatus according to any one of claims 12 to 17, wherein a primary database runs in the first virtual device, the standby node is provided with a second virtual device, a standby database runs in the second virtual device, and the first data packet carries an access request for the primary database that a client sends to the primary node,
    the first thread control unit is further specifically configured to:
    obtain the first to-be-processed information from a physical network interface card of the primary node; and
    send the first data packet to the primary database and the standby database simultaneously, so that the primary node and the standby node process the first data packet in the same order.
  19. The apparatus according to any one of claims 12 to 18, wherein the apparatus further comprises a third thread control unit,
    the third thread control unit is configured to:
    obtain load thresholds of the primary node at n synchronization operations and same-dirty-page ratios between the primary node and the standby node at the n synchronization operations, wherein the load thresholds of the primary node at the n synchronization operations are $c_1, \dots, c_n$, the same-dirty-page ratios between the primary node and the standby node at the n synchronization operations are $w_1, \dots, w_n$, $c_1$ corresponds to $w_1$, ..., $c_n$ corresponds to $w_n$, and n is a positive integer greater than or equal to 2;
    determine $w_m$, wherein $w_m$ is the load threshold at the current moment after the n synchronization operations, $w_m = [(c_1 \times w_1) + \dots + (c_n \times w_n)] \div n$, and m is a positive integer;
    obtain $L_m$, wherein $L_m$ is the load value of the primary node at the current moment;
    if $L_m \le w_m$, generate a synchronization request, wherein the synchronization request is used to request synchronization of dirty pages of the primary node and the standby node; and
    write the synchronization request into the buffer module;
    the second thread control unit is further specifically configured to:
    perform consistency negotiation processing on the synchronization request, wherein a result of the consistency negotiation processing on the synchronization request is used to synchronize the order in which the primary node and the standby node process the synchronization request; and
    the first thread control unit is further specifically configured to:
    process the synchronization request according to the result of the consistency negotiation processing on the synchronization request.
  20. The apparatus according to claim 19, wherein before obtaining the load thresholds of the primary node at the n synchronization operations and the same-dirty-page ratios between the primary node and the standby node, the third thread control unit is further specifically configured to:
    obtain $\mathrm{SUM}_k$, wherein $\mathrm{SUM}_k$ is the sum of the load values obtained from the first load measurement through the k-th load measurement of the primary node, and k is a positive integer; and
    when $k \ge T_{\mathrm{count}}$, determine $c_0$, wherein $T_{\mathrm{count}}$ is a threshold for the number of load measurements, $c_0$ is the load threshold for the first synchronization operation of the primary node, and $c_0 = \mathrm{SUM}_k \div k$; or
    when $k < T_{\mathrm{count}}$, obtain $L_{k+1}$, wherein $L_{k+1}$ is the load value of the primary node obtained from the (k+1)-th load measurement and $T_{\mathrm{count}}$ is a threshold for the number of load measurements; obtain $\mathrm{SUM}_{k+1}$, wherein $\mathrm{SUM}_{k+1} = \mathrm{SUM}_k + L_{k+1}$; and when $k+1 \ge T_{\mathrm{count}}$, determine $c_0$, wherein $c_0$ is the load threshold for the first synchronization operation of the primary node and $c_0 = \mathrm{SUM}_{k+1} \div (k+1)$.
  21. The apparatus according to claim 19 or 20, wherein the load value of the primary node comprises a processor load value, and the load threshold of the primary node comprises a processor load threshold.
  22. The apparatus according to claim 19 or 20, wherein the load value of the primary node comprises a memory load value, and the load threshold of the primary node comprises a memory load threshold.
  23. A data synchronization processing apparatus, comprising a processor coupled to a memory, wherein
    the memory is configured to store a computer program; and
    the processor is configured to execute the computer program stored in the memory, so that the apparatus performs the method according to any one of claims 1 to 11.
PCT/CN2018/082225 2018-04-08 2018-04-08 Data synchronization processing method and apparatus WO2019195969A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/082225 WO2019195969A1 (en) 2018-04-08 2018-04-08 Data synchronization processing method and apparatus
CN201880004742.8A CN110622478B (en) 2018-04-08 2018-04-08 Method and device for data synchronous processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/082225 WO2019195969A1 (en) 2018-04-08 2018-04-08 Data synchronization processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2019195969A1 (en) 2019-10-17

Family

ID=68162760

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/082225 WO2019195969A1 (en) 2018-04-08 2018-04-08 Data synchronization processing method and apparatus

Country Status (2)

Country Link
CN (1) CN110622478B (en)
WO (1) WO2019195969A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111352944A (en) * 2020-02-10 2020-06-30 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112714185A (en) * 2020-12-30 2021-04-27 威创集团股份有限公司 Access seat system
CN115643237A (en) * 2022-10-13 2023-01-24 北京华建云鼎科技股份公司 Data processing system for conference
US11615084B1 (en) 2018-10-31 2023-03-28 Splunk Inc. Unified data processing across streaming and indexed data sets
US11614923B2 (en) 2020-04-30 2023-03-28 Splunk Inc. Dual textual/graphical programming interfaces for streaming data processing pipelines
US11636116B2 (en) 2021-01-29 2023-04-25 Splunk Inc. User interface for customizing data streams
US11645286B2 (en) 2018-01-31 2023-05-09 Splunk Inc. Dynamic data processor for streaming and batch queries
US11663219B1 (en) 2021-04-23 2023-05-30 Splunk Inc. Determining a set of parameter values for a processing pipeline
US11687487B1 (en) * 2021-03-11 2023-06-27 Splunk Inc. Text files updates to an active processing pipeline
US11727039B2 (en) 2017-09-25 2023-08-15 Splunk Inc. Low-latency streaming analytics
US11886440B1 (en) 2019-07-16 2024-01-30 Splunk Inc. Guided creation interface for streaming data processing pipelines
US11989592B1 (en) 2021-07-30 2024-05-21 Splunk Inc. Workload coordinator for providing state credentials to processing tasks of a data processing pipeline

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767339B (en) * 2020-05-11 2023-06-30 北京奇艺世纪科技有限公司 Data synchronization method and device, electronic equipment and storage medium
CN112954133B (en) * 2021-01-20 2023-03-14 浙江大华技术股份有限公司 Method, device, electronic device and storage medium for synchronizing node time
CN115454657A (en) * 2022-08-12 2022-12-09 科东(广州)软件科技有限公司 Method and device for synchronization and mutual exclusion among tasks of user-mode virtual machine
CN117632799B (en) * 2023-12-05 2024-06-18 合芯科技有限公司 Data processing method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120216193A1 (en) * 2011-02-21 2012-08-23 Samsung Electronics Co., Ltd. Apparatus and method for controlling virtual machine schedule time
CN103309858A (en) * 2012-03-06 2013-09-18 深圳市腾讯计算机系统有限公司 Multi-threaded log management method and multi-threaded log management device
CN103501290A (en) * 2013-09-18 2014-01-08 万达信息股份有限公司 High-reliability service system establishment method based on dynamic-backup virtual machines
CN105224391A (en) * 2015-10-12 2016-01-06 浪潮(北京)电子信息产业有限公司 A kind of online backup method and system of virtual machine
CN105607962A (en) * 2015-10-22 2016-05-25 华为技术有限公司 Method and device for virtual machine backup
CN107729129A (en) * 2017-09-18 2018-02-23 惠州Tcl移动通信有限公司 A kind of multithread processing method based on synchrolock, terminal and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609419B (en) * 2009-06-29 2012-05-30 北京航空航天大学 Continuous on-line transferring data backup method of virtual machine and device thereof
CN102279766B (en) * 2011-08-30 2014-05-07 华为技术有限公司 Method and system for concurrently simulating processors and scheduler
JP5700009B2 (en) * 2012-09-18 2015-04-15 横河電機株式会社 Fault tolerant system
US9740563B2 (en) * 2013-05-24 2017-08-22 International Business Machines Corporation Controlling software processes that are subject to communications restrictions by freezing and thawing a computational process in a virtual machine from writing data
CN104683444B (en) * 2015-01-26 2017-11-17 电子科技大学 A kind of data migration method of data center's multi-dummy machine
CN104915151B (en) * 2015-06-02 2018-12-07 杭州电子科技大学 A kind of memory excess distribution method that active is shared in multi-dummy machine system
CN106168885B (en) * 2016-07-18 2019-09-24 浪潮(北京)电子信息产业有限公司 A kind of method and system of the logical volume dynamic capacity-expanding based on LVM

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11727039B2 (en) 2017-09-25 2023-08-15 Splunk Inc. Low-latency streaming analytics
US12105740B2 (en) 2017-09-25 2024-10-01 Splunk Inc. Low-latency streaming analytics
US11645286B2 (en) 2018-01-31 2023-05-09 Splunk Inc. Dynamic data processor for streaming and batch queries
US12013852B1 (en) 2018-10-31 2024-06-18 Splunk Inc. Unified data processing across streaming and indexed data sets
US11615084B1 (en) 2018-10-31 2023-03-28 Splunk Inc. Unified data processing across streaming and indexed data sets
US11886440B1 (en) 2019-07-16 2024-01-30 Splunk Inc. Guided creation interface for streaming data processing pipelines
CN111352944A (en) * 2020-02-10 2020-06-30 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN111352944B (en) * 2020-02-10 2023-08-18 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and storage medium
US11614923B2 (en) 2020-04-30 2023-03-28 Splunk Inc. Dual textual/graphical programming interfaces for streaming data processing pipelines
CN112714185A (en) * 2020-12-30 2021-04-27 威创集团股份有限公司 Access seat system
US11650995B2 (en) 2021-01-29 2023-05-16 Splunk Inc. User defined data stream for routing data to a data destination based on a data route
US11636116B2 (en) 2021-01-29 2023-04-25 Splunk Inc. User interface for customizing data streams
US11687487B1 (en) * 2021-03-11 2023-06-27 Splunk Inc. Text files updates to an active processing pipeline
US11663219B1 (en) 2021-04-23 2023-05-30 Splunk Inc. Determining a set of parameter values for a processing pipeline
US11989592B1 (en) 2021-07-30 2024-05-21 Splunk Inc. Workload coordinator for providing state credentials to processing tasks of a data processing pipeline
CN115643237B (en) * 2022-10-13 2023-08-11 北京华建云鼎科技股份公司 Data processing system for conference
CN115643237A (en) * 2022-10-13 2023-01-24 北京华建云鼎科技股份公司 Data processing system for conference

Also Published As

Publication number Publication date
CN110622478A (en) 2019-12-27
CN110622478B (en) 2020-11-06

Similar Documents

Publication Publication Date Title
WO2019195969A1 (en) Data synchronization processing method and apparatus
JP5258019B2 (en) A predictive method for managing, logging, or replaying non-deterministic operations within the scope of application process execution
Scales et al. The design of a practical system for fault-tolerant virtual machines
US10411953B2 (en) Virtual machine fault tolerance method, apparatus, and system
WO2017008675A1 (en) Method and device for transmitting data in virtual environment
US9652247B2 (en) Capturing snapshots of offload applications on many-core coprocessors
US9489230B1 (en) Handling of virtual machine migration while performing clustering operations
US8402318B2 (en) Systems and methods for recording and replaying application execution
US8812907B1 (en) Fault tolerant computing systems using checkpoints
JP5519909B2 (en) Non-intrusive method for replaying internal events in an application process and system implementing this method
US20140040206A1 (en) Pipelined data replication for disaster recovery
US20130047157A1 (en) Information processing apparatus and interrupt control method
TWI624757B (en) Data processing method, data processing system, and computer program product
US11537430B1 (en) Wait optimizer for recording an order of first entry into a wait mode by a virtual central processing unit
JP6305976B2 (en) Method, apparatus and system for delaying packets during execution of a network-driven wakeup operation on a computing device
JP2004355233A (en) Fault-tolerant system, program parallel execution method, fault detector for fault-tolerant system, and program
US9940152B2 (en) Methods and systems for integrating a volume shadow copy service (VSS) requester and/or a VSS provider with virtual volumes (VVOLS)
US20160062854A1 (en) Failover system and method
US10540301B2 (en) Virtual host controller for a data processing system
US20140068165A1 (en) Splitting a real-time thread between the user and kernel space
WO2015139327A1 (en) Failover method, apparatus and system
Scales et al. The design and evaluation of a practical system for fault-tolerant virtual machines
US20170235600A1 (en) System and method for running application processes
Zhou et al. Hycor: Fault-tolerant replicated containers based on checkpoint and replay
US11340967B2 (en) High availability events in a layered architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18914205

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18914205

Country of ref document: EP

Kind code of ref document: A1