WO2022227093A1 - Virtualization system and method for maintaining memory consistency in virtualization system - Google Patents

Virtualization system and method for maintaining memory consistency in virtualization system

Info

Publication number
WO2022227093A1
WO2022227093A1 PCT/CN2021/091774 CN2021091774W WO2022227093A1 WO 2022227093 A1 WO2022227093 A1 WO 2022227093A1 CN 2021091774 W CN2021091774 W CN 2021091774W WO 2022227093 A1 WO2022227093 A1 WO 2022227093A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
node
management node
broadcast
dvm
Prior art date
Application number
PCT/CN2021/091774
Other languages
French (fr)
Chinese (zh)
Inventor
胡雅琴
李硕
盖辰宁
丁帅
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202180090441.3A (CN116830093A)
Priority to PCT/CN2021/091774 (WO2022227093A1)
Publication of WO2022227093A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14 Protection against unauthorised use of memory or access to memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present application relates to the field of virtualization technology, and in particular, to a virtualization system and a method for maintaining memory consistency in the virtualization system.
  • the ARM architecture supports a multi-core system, and a multi-core system can be understood as a processor of an electronic device that can support multiple physical cores.
  • FIG. 1 shows a scenario diagram of an ARM architecture-based processor in a virtual machine (virtual machine, VM) application.
  • this virtual machine application scenario can include four levels of domains: domain 0 includes one physical core, domain 1 includes four physical cores, domain 2 includes two domain 1s, and domain 3 includes domain 2 and other domains (shown in the figure). Domains 1 to 4 can each correspond to their own shareable domain, and the shareable domains of domains 1 to 4 can be used to store and schedule the information that each domain needs in order to run. The specific content of the information depends on the actual application scenario, which is not limited here.
  • a virtual machine may correspond to multiple virtual central processing units (vCPUs).
  • virtual machine #1 may correspond to vCPU1-vCPU4
  • virtual machine #2 may correspond to vCPU5-vCPU8.
  • the virtual machine kernel CPU scheduler can schedule one vCPU to run on multiple physical cores, or multiple vCPUs to run on one physical core.
  • FIG. 1 shows only the example of one vCPU running on one physical core; this example does not limit the physical cores on which a vCPU runs.
  • other virtual machines such as VMx, etc., may also be included.
  • the architecture of VMx is similar to that of virtual machine #1 or virtual machine #2 and is not repeated here.
  • the virtual machine application scenario also includes a storage (memory) system and other nodes (miscellaneous node, MN).
  • the memory system is used to store memory page tables, etc.
  • the memory page table can also be called a page table, and it can store the mapping from virtual address (VA) to physical address (PA), where a VA is unique within the accessing process and a PA is unique in hardware.
  • each physical core first obtains from the memory system the memory page table related to that physical core and stores the obtained mappings in its own translation lookaside buffer (TLB). When a physical core needs the mapping between a VA and a PA, it first searches its own TLB. If the required mapping is not found there, the physical core can search the memory system.
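  • The lookup order described above (own TLB first, then the memory system) can be illustrated with a minimal Python sketch. The class name `PhysicalCore`, the dictionary-based page table, and the `tlb_misses` counter are assumptions made for illustration only; they are not part of the application.

```python
# Hypothetical model: each core keeps a private TLB that caches VA -> PA
# mappings taken from a page table held in the memory system.
class PhysicalCore:
    def __init__(self, page_table):
        self.page_table = page_table  # page table in the memory system
        self.tlb = {}                 # this core's private TLB
        self.tlb_misses = 0

    def translate(self, va):
        """Return the PA for va, consulting the TLB before the page table."""
        if va in self.tlb:
            return self.tlb[va]
        # TLB miss: fall back to the page table in the memory system.
        self.tlb_misses += 1
        pa = self.page_table[va]
        self.tlb[va] = pa
        return pa


page_table = {0x1000: 0x8000, 0x2000: 0x9000}
core = PhysicalCore(page_table)
first = core.translate(0x1000)   # miss, filled from the page table
second = core.translate(0x1000)  # hit, served from the TLB
```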
  • the MN is used to manage multiple physical cores.
  • the MN can receive distributed virtual memory (DVM) requests from a certain physical core, and the DVM requests can be used to maintain memory consistency in virtual machine application scenarios.
  • the MN may send a snoop (snp) DVM request to other physical cores, and the snpDVM request is used to instruct other physical cores to maintain the consistency of their respective TLBs.
  • the MN receives response (Resp) information from other physical cores, and the MN returns Resp information to the physical core that initiated the DVM request.
  • maintaining TLB consistency can be understood as follows: each physical core invalidates the modified entry in its TLB, and if the physical core later needs the content of the invalidated entry, it obtains that content from the memory system.
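  • The request, snoop, and response sequence described above can be sketched as a small simulation. This is a simplified model under stated assumptions: the classes `Core` and `ManagementNode`, the method names, and the tuple-shaped responses are hypothetical, and real hardware exchanges these messages over an interconnect rather than by function calls.

```python
class Core:
    def __init__(self, core_id):
        self.core_id = core_id
        self.tlb = {}

    def on_snp_dvm(self, va):
        # Invalidate the modified entry; a later access refills it from memory.
        self.tlb.pop(va, None)
        return ("Resp", self.core_id)


class ManagementNode:
    def __init__(self, cores):
        self.cores = cores

    def handle_dvm(self, requester_id, va):
        # Snoop every core except the requester and gather their responses.
        resps = [c.on_snp_dvm(va) for c in self.cores
                 if c.core_id != requester_id]
        # Resp is returned to the requester only after all responses arrived.
        return ("Resp", requester_id), resps


cores = [Core(i) for i in range(4)]
for c in cores:
    c.tlb[0x1000] = 0x8000
mn = ManagementNode(cores)
final, collected = mn.handle_dvm(requester_id=0, va=0x1000)
```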
  • the virtual machine application scenario of FIG. 1 can be applied to a certain electronic device, and the physical core, memory system and MN shown in FIG. 1 can all be included in the electronic device.
  • the virtual machine application scenario shown in FIG. 1 can also be applied to a distributed system.
  • the distributed system may include multiple electronic devices, and the physical core, memory system, and MN shown in FIG. 1 may be distributed and set in different electronic devices.
  • the MN may also be referred to as a management node, and the MN may be a physical core used to implement a management function, or multiple physical cores used to implement the management function, which is not limited here.
  • an inter-processor interrupt (IPI) method may be used.
  • when a physical core modifies entry information of the memory page table in the shared memory, the physical core must invalidate that entry information in its own TLB and must also send, through the MN, an IPI instruction to the other physical cores. The IPI instruction includes information indicating the maintenance of TLB consistency. The other physical cores invalidate the corresponding entry information in their respective TLBs in the IPI interrupt handler, thereby keeping the shared memory page table consistent.
  • the IPI method is used in the above possible implementation.
  • when a physical core receives the IPI request, it suspends the currently executing task, responds to the interrupt, performs the page table synchronization, and then returns to the original program. This process affects the performance of the physical core.
  • the embodiments of the present application provide a virtualization system and a method for maintaining memory consistency in the virtualization system.
  • the broadcast range of the MN can be limited, the time the MN waits for the invalidation operations of the physical cores is reduced, and the physical cores do not need to interrupt their tasks, so their performance is not affected.
  • an embodiment of the present application provides a virtualization system, including a request node and a first management node. The request node is configured to send a distributed virtual memory (DVM) request to the first management node when performing memory consistency maintenance of the virtualization system, where the DVM request includes broadcast range information. The first management node is configured to parse the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer, and to send broadcast information to each of the M target nodes, where the broadcast information is used to instruct each target node to perform memory consistency maintenance.
  • the embodiments of the present application can limit the broadcast range of the first management node and reduce the time the first management node waits for the invalidation operations of the physical cores, and the physical cores do not need to interrupt their tasks, so their performance is not affected.
  • the broadcast range information includes: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, where a layer corresponds to multiple physical cores.
  • the broadcast range can be indicated with the physical core as the granularity, which limits the broadcast range precisely, or with the layer as the granularity. Because the identifier of one layer can correspond to multiple physical cores, multiple physical cores can be indicated with fewer identifiers, which saves system resources.
  • the first management node is specifically configured to send broadcast information to each of the M target nodes when it parses out the identifiers of the M physical cores indicated by the broadcast range information; or, the first management node is specifically configured to send broadcast information to each of the M target nodes in the layer when it parses out the identifier of the layer indicated by the broadcast range information.
  • the broadcast range of the first management node can be limited, and the time for the first management node to wait for the invalid operation of the physical core can be reduced.
  • the nodes indicated by the broadcast range information include nodes managed by a second management node; the first management node is further configured to send the DVM request to the second management node; the second management node is configured to parse the DVM request to obtain the broadcast range information and to send broadcast information to the nodes managed by the second management node.
  • the layers indicated by the broadcast range information include a layer managed by the second management node; the first management node is further configured to send the DVM request to the second management node; the second management node is configured to parse the DVM request to obtain the broadcast range information and to send the broadcast information to the nodes in the layer managed by the second management node.
  • because the second management node sends the broadcast information to the nodes in the layer that it manages, multiple nodes can be indicated with fewer identifiers, thereby saving system resources.
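  • The forwarding between a first and a second management node can be sketched as follows. This is an illustrative model only: the `peer` link, the `forwarded` flag that prevents a request from bouncing back, and the `sent` list are assumptions, and the application does not specify how the two management nodes are connected.

```python
class ManagementNode:
    def __init__(self, name, managed_cores, peer=None):
        self.name = name
        self.managed = set(managed_cores)
        self.peer = peer   # hypothetical link to a second management node
        self.sent = []     # cores this node has sent broadcast information to

    def handle_dvm(self, target_cores, forwarded=False):
        targets = set(target_cores)
        # Broadcast only to targets that this management node manages.
        self.sent.extend(sorted(targets & self.managed))
        # Targets managed elsewhere: pass the same DVM request to the peer.
        remote = targets - self.managed
        if remote and self.peer is not None and not forwarded:
            self.peer.handle_dvm(target_cores, forwarded=True)


mn2 = ManagementNode("MN2", [4, 5, 6, 7])
mn1 = ManagementNode("MN1", [0, 1, 2, 3], peer=mn2)
mn1.handle_dvm([1, 2, 5])
```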
  • the requesting node is provided with a register used to store the broadcast range information; the requesting node is specifically configured to generate an instruction when performing memory consistency maintenance of the virtualization system, package the instruction and the broadcast range information as a DVM request, and send the DVM request to the first management node.
  • the broadcast range can be defined and updated as needed, and the requesting node can package the instruction and the broadcast range information into a DVM request and send it to the first management node, which is convenient and quick.
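  • As a rough illustration of packaging an instruction together with the register contents, consider the sketch below. The field layout of `DvmRequest`, the `address` operand, and the method names are hypothetical; the application only states that the instruction and the broadcast range information are packaged into one DVM request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DvmRequest:
    instruction: str      # e.g. "TLBI" or "IC", as named in the text
    address: int          # hypothetical operand: the VA being invalidated
    broadcast_range: int  # copied from the broadcast range register


class RequestNode:
    def __init__(self):
        self.broadcast_range_register = 0

    def set_broadcast_range(self, value):
        # Stands in for software updating the register ahead of time.
        self.broadcast_range_register = value

    def make_dvm_request(self, instruction, address):
        # Package the instruction with the current register contents.
        return DvmRequest(instruction, address, self.broadcast_range_register)


rn = RequestNode()
rn.set_broadcast_range(0b0110)
req = rn.make_dvm_request("TLBI", 0x1000)
```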
  • the instruction includes a translation lookaside buffer invalidate (TLBI) instruction or an instruction cache maintenance (IC) instruction.
  • the translation lookaside buffer instruction TLBI can maintain the consistency of the TLB in the physical core
  • the cache maintenance instruction IC instruction can also maintain the consistency of the TLB in the physical core.
  • the first management node is further configured to collect M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance; the first management node is also configured to send information indicating the completion of memory consistency maintenance to the requesting node. In this way, the first management node can collect and forward the information indicating that memory consistency maintenance is complete.
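  • Collecting M responses before reporting completion can be sketched with a pending set. The class `ResponseCollector` and its method name are assumptions for illustration; the sketch only shows that completion is signalled once every target node has responded, regardless of arrival order.

```python
class ResponseCollector:
    def __init__(self, target_ids):
        # One pending entry per target node (M entries in total).
        self.pending = set(target_ids)

    def on_response(self, node_id):
        """Record one response; return True once all M have arrived."""
        self.pending.discard(node_id)
        return not self.pending


collector = ResponseCollector([1, 2, 5])
results = [collector.on_response(n) for n in (2, 5, 1)]
```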
  • the requesting node is further configured to generate a data synchronization barrier (DSB) instruction when multiple DVM requests are sent within a preset time. The DSB instruction is used to instruct the first management node that, after collection of the response information of the nodes corresponding to the multiple DVM requests is complete, the information indicating completion of memory consistency maintenance is sent synchronously to the first management node. In this way, the requesting node can generate a DSB instruction to indicate that memory consistency maintenance has been completed for the nodes corresponding to the multiple DVM requests.
  • the first management node is specifically configured to send broadcast information to each of the M target nodes in a covering manner, where the covering manner is a manner of covering the nodes other than the M target nodes.
  • an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: when a requesting node performs memory consistency maintenance in a virtualization system, sending a distributed virtual memory (DVM) request to a first management node, where the DVM request includes broadcast range information; the first management node parses the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer; the first management node sends broadcast information to each of the M target nodes, where the broadcast information is used to instruct each target node to perform memory consistency maintenance.
  • the broadcast range information includes: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, where a layer corresponds to multiple physical cores.
  • the first management node sending broadcast information to each of the M target nodes includes: when the first management node parses out the identifiers of the M physical cores indicated by the broadcast range information, it sends broadcast information to each of the M target nodes; or, when the first management node parses out the identifier of the layer indicated by the broadcast range information, it sends broadcast information to each of the M target nodes in the layer.
  • the nodes indicated by the broadcast range information include nodes managed by a second management node, and/or layers managed by the second management node, and the method further includes: the first management node sends the DVM request to the second management node.
  • the requesting node sending a distributed virtual memory (DVM) request to the first management node when performing memory consistency maintenance of the virtualization system includes: the requesting node generates an instruction when performing memory consistency maintenance of the virtualization system; the requesting node packages the instruction and the broadcast range information into a DVM request; the requesting node sends the DVM request to the first management node.
  • the instruction includes a translation lookaside buffer invalidate (TLBI) instruction or an instruction cache maintenance (IC) instruction.
  • the first management node collects M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance; the first management node sends to the requesting node information indicating that memory consistency maintenance is complete.
  • when the requesting node sends multiple DVM requests within a preset time, a data synchronization barrier (DSB) instruction is generated. The DSB instruction is used to instruct the first management node that, after collection of the response information of the nodes corresponding to the multiple DVM requests is complete, the information indicating completion of memory consistency maintenance is sent synchronously to the first management node.
  • the first management node sending broadcast information to each of the M target nodes includes: the first management node sends broadcast information to each of the M target nodes in a covering manner, where the covering manner is a manner of covering the nodes other than the M target nodes.
  • an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: a first management node receives a distributed virtual memory (DVM) request from a requesting node, where the DVM request includes broadcast range information; the first management node parses the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer; the first management node sends broadcast information to each of the M target nodes, where the broadcast information is used to instruct each target node to perform memory consistency maintenance.
  • the broadcast range information includes: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, where a layer corresponds to multiple physical cores.
  • the first management node sending broadcast information to each of the M target nodes includes: when the first management node parses out the identifiers of the M physical cores indicated by the broadcast range information, it sends broadcast information to each of the M target nodes; or, when the first management node parses out the identifier of the layer indicated by the broadcast range information, it sends broadcast information to each of the M target nodes in the layer.
  • the nodes indicated by the broadcast range information include nodes managed by a second management node, and/or layers managed by the second management node, and the method further includes: the first management node sends the DVM request to the second management node.
  • the first management node collects M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance; the first management node sends to the requesting node information indicating that memory consistency maintenance is complete.
  • the first management node sending broadcast information to each of the M target nodes includes: the first management node sends broadcast information to each of the M target nodes in a covering manner, where the covering manner is a manner of covering the nodes other than the M target nodes.
  • an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: a requesting node generates an instruction when performing memory consistency maintenance in the virtualization system; the requesting node packages the instruction and the broadcast range information set in the requesting node into a DVM request; the requesting node sends the DVM request to the first management node.
  • the broadcast range information includes: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, where a layer corresponds to multiple physical cores.
  • the instruction includes a translation lookaside buffer invalidate (TLBI) instruction or an instruction cache maintenance (IC) instruction.
  • the requesting node receives information from the first management node indicating that the maintenance of memory consistency is completed.
  • when the requesting node sends multiple DVM requests within a preset time, a data synchronization barrier (DSB) instruction is generated. The DSB instruction is used to instruct the first management node that, after collection of the response information of the nodes corresponding to the multiple DVM requests is complete, the information indicating completion of memory consistency maintenance is sent synchronously to the first management node.
  • embodiments of the present application provide a computer-readable storage medium, where the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device is caused to perform the technical solution of any of the above-mentioned third and fourth aspects.
  • a sixth aspect is a computer program product according to an embodiment of the present application.
  • the computer program product includes instructions, and when the instructions are executed on a computer, the computer can execute the technical solutions of any of the third and fourth aspects.
  • FIG. 1 is a schematic diagram of a scenario of an ARM architecture-based processor in a virtual machine application provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of a first system architecture to which the method of the embodiment of the present application is applied;
  • FIG. 3 is a schematic diagram of a second system architecture to which the method of the embodiment of the present application is applied;
  • FIG. 4 is a schematic diagram of a third system architecture to which the method of the embodiment of the present application is applied;
  • FIG. 5 is a schematic diagram of a fourth system architecture to which the method of the embodiment of the present application is applied;
  • FIG. 6 is a schematic flowchart of data synchronization in a multi-core system provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a logical architecture of an MN node parsing a DVM request according to an embodiment of the present application
  • FIG. 8 is a schematic flowchart of a specific data synchronization method in a multi-core system according to an embodiment of the present application.
  • words such as “first” and “second” are used to distinguish the same or similar items with basically the same function and effect.
  • the first event and the second event are only for distinguishing different events, and do not limit their order.
  • the words “first”, “second” and the like do not limit quantity or execution order, and items qualified by “first” and “second” are not necessarily different.
  • “At least one” means one or more
  • “plural” means two or more.
  • “And/or” describes an association relationship between associated objects and indicates that three relationships are possible. For example, “A and/or B” can indicate: A alone, both A and B, or B alone, where A and B can each be singular or plural.
  • the character “/” generally indicates that the associated objects are in an “or” relationship.
  • “At least one of the following item(s)” or similar expressions refer to any combination of these items, including any combination of a single item or plural items.
  • “at least one of a, b, or c” can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
  • Physical core: can be manufactured from single crystal silicon with a certain production process, and is used to perform steps such as calculation, receiving or storing commands, and processing data. Each physical core has its own independent TLB.
  • TLB maintenance instruction: an instruction used to instruct maintenance of the consistency of the TLB in the physical core.
  • Virtual machine: a complete computer system with complete hardware system functions, simulated by software and running in an isolated environment.
  • Bus: the public communication trunk for transmitting data between various nodes, which can be used to transmit messages or requests.
  • Broadcast range register: the register used to store broadcast range information.
  • a physical core can correspond to a request node (RN); alternatively, an RN can be understood as a node that runs a vCPU in the physical core to achieve a certain function. It should be noted that RN and MN are relative concepts: one MN can manage multiple RNs, and the multiple RNs can correspond to one virtual machine or to multiple virtual machines.
  • the methods of the embodiments of the present application can be applied to virtual machine application scenarios in the fields of embedded, consumer electronics, big data, automotive electronics, mass storage, imaging equipment, industrial control, security systems, or cloud computing.
  • the electronic device that executes the method of the embodiment of the present application includes a processor, and the processor may adopt an ARM architecture, and the processor based on the ARM architecture has the advantages of high speed, low power consumption, and low price.
  • the electronic device may also be referred to as a terminal device, a terminal (terminal), a user equipment (UE), a mobile station (mobile station, MS), or a mobile terminal (mobile terminal, MT).
  • the electronic device can be a mobile phone, a smart TV, a wearable device, a tablet computer (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and so on.
  • the embodiments of the present application do not limit the specific technology and specific device form adopted by the electronic device.
  • a method of broadcasting TLB invalidation operations can be adopted.
  • when a physical core modifies one or more mapping entries of the memory page table in the shared memory, the physical core generates a TLBI instruction, packages the TLBI instruction as a DVM request, and sends the DVM request to the MN through the coherent hub interface (CHI).
  • the ARM software enables a force broadcast (FB) control instruction by default.
  • this instruction forces the MN to broadcast the DVM request to all physical cores within the inner shareable (IS) range: the MN broadcasts the snpDVM request to all physical cores within the IS range, and each of these physical cores invalidates the corresponding entry information in its own TLB according to the request. After the MN has collected the Resp information of all physical cores within the IS range, it indicates completion of the TLB maintenance to the physical core that sent the DVM request, thereby maintaining the TLB consistency of each physical core.
  • the IS range is defined at the beginning of the system design, and the IS range is large and cannot be changed.
  • the MN can perform a DVM synchronization (Sync) operation, so that all physical cores within the IS range, after completing invalidation of the corresponding entry information in their TLBs, return the Resp information to the MN at the same time.
  • the MN needs to indiscriminately wait for the corresponding entry information invalidation operation in the TLBs of all physical cores within the IS range to end, and there is a long waiting time and a large performance overhead.
  • embodiments of the present application provide a virtualization system and a method for maintaining memory consistency in the virtualization system.
  • the broadcast range of the MN can be limited, the time the MN waits for the invalidation operations of the physical cores can be reduced, and the physical cores do not need to interrupt their tasks, so their performance is not affected.
  • the RN in the embodiment of the present application sends a request to the MN, and the request includes broadcast range information, so that the broadcast range of the MN is reduced.
  • after the MN sends the broadcast information to the nodes corresponding to the broadcast range information, the time it waits for the nodes to synchronize is correspondingly shortened, the efficiency of maintaining consistency is improved, and the nodes do not need to interrupt their own tasks, so their performance is not affected.
  • a broadcast range register may be set in the physical core of the embodiment of the present application, and broadcast range information is preset in the broadcast range register.
  • the broadcast range register can receive user settings. For example, the user can set or modify the broadcast range information in the broadcast range register by using CPU scheduler software in advance.
  • the MN is configured with logic capable of parsing the broadcast range information carried in the DVM request.
  • an Affinity-type register may be set in the MN, and the Affinity register may be set to indicate the RNs corresponding to the MN and the system level (that is, the layer) where each RN is located.
  • the Affinity register is statically configurable, for example, it can be configured when software deploys a VM.
  • when the broadcast range information in this embodiment of the present application indicates the broadcast range, the broadcast range may be indicated with a physical core as the granularity.
  • the broadcast range information may include the identifiers of the physical cores to be broadcast to; when broadcasting, the MN broadcasts to the physical cores indicated by the broadcast range information.
  • a broadcast range register with a range of 128 bits (binary digit, Bit) can be defined, and each Bit represents a physical core.
  • when the broadcast range is indicated with the physical core as the granularity, the MN can broadcast to the physical cores according to the indication of the broadcast range information. It can be understood that, in this implementation, because the broadcast range information uses the physical core as its granularity, the broadcast range can be limited precisely.
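The per-core encoding of a 128-bit broadcast range register can be sketched as follows. This is a minimal illustration, assuming that bit i of the register stands for physical core i; the names `encode_cores` and `decode_cores` are hypothetical and do not come from the application.

```python
REGISTER_BITS = 128  # width of the broadcast range register in this example


def encode_cores(core_ids):
    """Build a 128-bit broadcast range value with one bit per physical core.

    Assumption: bit i corresponds to physical core i.
    """
    value = 0
    for core in core_ids:
        if not 0 <= core < REGISTER_BITS:
            raise ValueError(f"core id {core} does not fit in the register")
        value |= 1 << core
    return value


def decode_cores(value):
    """Return the sorted list of physical cores whose bit is set."""
    return [core for core in range(REGISTER_BITS) if (value >> core) & 1]


broadcast_range = encode_cores([1, 5, 6])
print(hex(broadcast_range))           # 0x62
print(decode_cores(broadcast_range))  # [1, 5, 6]
```

With this layout the MN can recover the exact set of cores to snoop from a single register value, which is what makes the per-core granularity precise.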
  • the broadcast range information in this embodiment of the present application indicates the broadcast range
  • the broadcast range may also be indicated by a layer as a granularity.
  • One layer may include multiple physical cores, and the layer division rules are not limited in this embodiment of the present application.
  • the broadcast range information can include the identifier of each layer that needs to be broadcast to, and when the MN broadcasts, it broadcasts to all physical cores in the layers indicated by the broadcast range information. It can be understood that, in this implementation, because the broadcast range information uses the layer as its granularity, the identifier of one layer can correspond to multiple physical cores, so multiple physical cores can be indicated with fewer identifiers, which saves system resources.
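Layer granularity can be illustrated with a small lookup from layer identifiers to physical cores. The layer layout below is purely hypothetical, since the application does not fix the layer division rules, and `cores_for_layers` is an illustrative name.

```python
# Hypothetical layer layout: the application does not fix how layers are divided.
LAYERS = {
    0: [0, 1, 2, 3],
    1: [4, 5, 6, 7],
    2: [8, 9, 10, 11],
}


def cores_for_layers(layer_ids, layers=LAYERS):
    """Expand layer identifiers into the physical cores the MN broadcasts to."""
    cores = []
    for layer in layer_ids:
        cores.extend(layers[layer])
    return sorted(set(cores))


targets = cores_for_layers([0, 2])
print(targets)  # [0, 1, 2, 3, 8, 9, 10, 11]
print(f"{len([0, 2])} layer identifiers address {len(targets)} cores")
```

Two identifiers are enough here to address eight cores, at the cost of not being able to exclude an individual core inside a layer.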
  • the number of MNs in the embodiment of the present application may be one.
  • the RN can send the broadcast range information to one of the MNs.
  • the MN can also send the broadcast range information to another MN, so that the other MN can also send broadcast information to the nodes that it manages.
  • FIG. 2 to FIG. 5 show schematic diagrams of four possible virtualization system architectures according to the embodiments of the present application.
  • FIG. 2 shows a schematic diagram of a first virtualization system architecture to which the method of the embodiment of the present application is applied.
  • the broadcast range information indicates the broadcast range
  • the broadcast range is indicated with the physical core as the granularity
  • the nodes indicated in the broadcast range information are all managed by one MN.
  • the virtualization system includes RN-0, MN, and RN-1 to RN-n.
  • RN-0 corresponds to physical core #1
  • TLB and broadcast range register are set in physical core #1
  • broadcast range information is preset in the broadcast range register.
  • the broadcast range information may be stored in the broadcast range register through the virtual machine core CPU scheduler in advance, where the broadcast range information may include the identifier of the physical core related to the VM running in the RN-0. It can be understood that, when the broadcast range information in the broadcast register needs to be updated, the update code can be scheduled by the virtual machine core CPU scheduler to update the broadcast register.
  • the process of maintaining memory consistency in a virtualization system is described below by taking RN-0 modifying the entry information of the shared page table and initiating maintaining the consistency of entry information in the TLB of each physical core as an example.
  • RN-0 modifies the entry information of the shared page table
  • TLBI is generated in RN-0, and the TLBI is used to maintain the TLB in physical core #1.
  • RN-0 completes the invalidation of the entry information in its own TLB. Further, RN-0 can also obtain the updated memory page table from the memory system, set it in the TLB of RN-0, and complete the update of the shared page table of RN-0.
  • RN-0 can package the broadcast range information and the TLBI into a DVM request, so that the DVM request includes the broadcast range information. RN-0 sends the DVM request to the MN through the CHI bus, and the MN parses the broadcast range information to obtain the physical cores that it indicates.
  • the physical cores indicated by the broadcast range information may also be referred to as other RNs.
  • the physical core indicated by the broadcast range information may be RN-1, and the MN broadcasts a snpDVM request to RN-1. It can be understood that the number of physical cores indicated by the broadcast range information may be determined according to an actual scenario, and may be one or more, which is not specifically limited in this embodiment of the present application.
  • RN-1 can invalidate the entry information in its own TLB based on the snpDVM request, and RN-1 can also further obtain the updated memory page table from the memory system, and send Resp information to the MN.
  • RN-0 may modify multiple pieces of entry information in the memory page table within a period of time, in which case RN-0 may generate multiple TLBIs asynchronously within this period. Correspondingly, RN-0 asynchronously sends the DVM requests corresponding to the multiple TLBIs to the MN.
  • RN-0 can generate a data synchronization barrier (DSB) instruction, which instructs the MN to collect the Resp information of the nodes corresponding to the multiple DVM requests and then synchronously send the Resp information of all of those nodes to RN-0, indicating that memory consistency maintenance of the nodes corresponding to the multiple DVM requests is complete.
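The FIG. 2 flow (several TLBIs, a range-limited snpDVM broadcast, and a DSB that gathers the Resp information) can be sketched as a small synchronous simulation. This is only a model under stated assumptions: the class and method names (`RequestNode`, `ManagementNode`, `handle_dvm`, `handle_dsb`) are hypothetical, the TLB is reduced to a dictionary, and real hardware would exchange these messages asynchronously over the CHI bus.

```python
from dataclasses import dataclass, field


@dataclass
class RequestNode:
    name: str
    tlb: dict = field(default_factory=dict)  # virtual page -> physical page

    def snoop(self, page):
        """Handle a snpDVM request: invalidate the entry and return a Resp."""
        self.tlb.pop(page, None)
        return (self.name, page)


@dataclass
class ManagementNode:
    nodes: dict                                  # name -> RequestNode
    pending: list = field(default_factory=list)  # Resp info not yet reported

    def handle_dvm(self, page, broadcast_range):
        """Send snpDVM only to the nodes named in the broadcast range."""
        for name in broadcast_range:
            self.pending.append(self.nodes[name].snoop(page))

    def handle_dsb(self):
        """On DSB, hand all collected Resp info back in one batch."""
        collected, self.pending = self.pending, []
        return collected


nodes = {f"RN-{i}": RequestNode(f"RN-{i}", {0x10: 0xA, 0x20: 0xB}) for i in (1, 2, 3)}
mn = ManagementNode(nodes)

# RN-0 issues two TLBIs; its broadcast range register names only RN-1.
for page in (0x10, 0x20):
    mn.handle_dvm(page, broadcast_range=["RN-1"])

print(mn.handle_dsb())    # [('RN-1', 16), ('RN-1', 32)]
print(nodes["RN-1"].tlb)  # {}
print(nodes["RN-2"].tlb)  # {16: 10, 32: 11}
```

RN-2 and RN-3 keep their TLB contents and are never interrupted, which is the effect the broadcast range information is meant to achieve.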
  • FIG. 3 shows a schematic diagram of a second virtualization system architecture to which the method of this embodiment of the present application is applied.
  • the physical core is used as the granularity to indicate the broadcast range, and the nodes indicated in the broadcast range information are managed by multiple different MNs.
  • the virtualization system includes RN-0, MN1, MN2, RN1-1 to RN1-n, and RN2-1 to RN2-n.
  • MN1 is used to manage RN1-1 to RN1-n
  • MN2 is used to manage RN2-1 to RN2-n.
  • RN-0 corresponds to physical core #1
  • TLB and broadcast range register are set in physical core #1
  • broadcast range information is preset in the broadcast range register.
  • the process of maintaining memory consistency in a virtualization system is described below by taking RN-0 modifying the entry information of the shared page table and initiating maintaining the consistency of entry information in the TLB of each physical core as an example.
  • the nodes indicated in the broadcast range information are managed by multiple different MNs.
  • the physical cores indicated in the broadcast range information include RN1-1, RN1-2, RN2-1 and RN2-2; RN1-1 and RN1-2 are managed by MN1, and RN2-1 and RN2-2 are managed by MN2.
  • RN-0 sends a DVM request to MN1
  • MN1 parses the broadcast range information, and determines that the broadcast range information includes not only RN1-1 and RN1-2, which correspond to MN1, but also RN2-1 and RN2-2, which correspond to MN2.
  • MN1 broadcasts the snpDVM request to RN1-1 and RN1-2. At the same time, MN1 sends the DVM request to MN2. MN2 parses the broadcast range information, determines that the physical cores indicated in the broadcast range information include RN2-1 and RN2-2, and sends snpDVM requests to RN2-1 and RN2-2.
  • RN1-1 and RN1-2 may perform the entry information invalidation operation in the TLB and send Resp information to MN1 in the manner described in the embodiment corresponding to FIG. 2 .
  • RN2-1 and RN2-2 may perform the entry information invalidation operation in the TLB and send Resp information to MN2 in the manner described in the embodiment corresponding to FIG. 2 .
  • after MN2 collects the Resp information of RN2-1 and RN2-2, it can send the Resp information to MN1; MN1 collects the Resp information of RN1-1, RN1-2 and MN2, and sends the Resp information to RN-0.
  • FIG. 3 shows the situation in which the physical cores indicated by the broadcast range information correspond to two MNs.
  • the number of MNs may be greater than 2, and the MNs can transmit DVM requests to one another, so that each MN broadcasts to the RNs it manages that are indicated in the broadcast range information and collects their reply information.
  • the embodiment of the present application does not limit the number of MNs and the communication mode between multiple MNs.
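The forwarding between management nodes in the FIG. 3 scenario can be sketched as below. It is a simplified model, assuming that the first MN forwards the DVM request once to each peer and that Resp information is represented by plain strings; the `peers` list and the `forwarded` flag are hypothetical details, since the application does not limit how MNs communicate.

```python
class ManagementNode:
    def __init__(self, name, managed):
        self.name = name
        self.managed = set(managed)  # RNs this MN manages
        self.peers = []              # other MNs reachable from this one

    def handle_dvm(self, broadcast_range, forwarded=False):
        """Return the Resp info gathered for one DVM request."""
        wanted = set(broadcast_range)
        # snpDVM goes only to local RNs that appear in the broadcast range.
        responses = [f"Resp({rn})" for rn in sorted(wanted & self.managed)]
        if not forwarded and wanted - self.managed:
            # Some targets belong to another MN: forward the DVM request once.
            for peer in self.peers:
                responses.extend(peer.handle_dvm(broadcast_range, forwarded=True))
        return responses


mn1 = ManagementNode("MN1", ["RN1-1", "RN1-2", "RN1-3"])
mn2 = ManagementNode("MN2", ["RN2-1", "RN2-2", "RN2-3"])
mn1.peers.append(mn2)

resp = mn1.handle_dvm(["RN1-1", "RN1-2", "RN2-1", "RN2-2"])
print(resp)
# ['Resp(RN1-1)', 'Resp(RN1-2)', 'Resp(RN2-1)', 'Resp(RN2-2)']
```

MN1 returns a single aggregated list to the requester, mirroring how MN1 collects the Resp information of its own RNs and of MN2 before answering RN-0.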
  • FIG. 4 shows a schematic diagram of a third virtualization system architecture to which the method of this embodiment of the present application is applied.
  • the broadcast range information indicates the broadcast range
  • the broadcast range is indicated by layers as granularity
  • the nodes in the layers indicated in the broadcast range information are all managed by one MN.
  • the virtualization system includes RN-0, MN0, RN0-1 to RN0-n.
  • RN0-1 to RN0-n belong to one layer 0, and MN0 is used to manage RN0-1 to RN0-n.
  • RN-0 corresponds to physical core #1
  • TLB and broadcast range register are set in physical core #1
  • broadcast range information is preset in the broadcast range register.
  • the process of maintaining memory consistency in a virtualization system is described below by taking RN-0 modifying the entry information of the shared page table and initiating maintaining the consistency of entry information in the TLB of each physical core as an example.
  • the broadcast range information indicates the broadcast range
  • the broadcast range is indicated by the granularity of layers.
  • layer 0 is indicated in the broadcast range information, and layer 0 includes RN0-1 to RN0-n.
  • RN-0 sends a DVM request to MN0
  • MN0 parses the broadcast range information, determines that the broadcast range information is layer 0, and MN0 sends snpDVM requests to RN0-1 to RN0-n in layer 0.
  • RN0-1 to RN0-n may perform the entry information invalidation operation in the TLB in the manner described in the embodiment corresponding to FIG. 2, and send Resp information to MN0; MN0 collects the Resp information of RN0-1 to RN0-n in layer 0, and sends Resp information to RN-0.
  • FIG. 5 shows a schematic diagram of a fourth virtualization system architecture to which the method of this embodiment of the present application is applied.
  • the broadcast range information indicates the broadcast range
  • the broadcast range is indicated by layers as granularity, and multiple layers indicated in the broadcast range information are managed by multiple different MNs.
  • the virtualization system includes RN-0, MN3, MN4, RN3-1 to RN3-n, and RN4-1 to RN4-n.
  • RN3-1 to RN3-n belong to layer 3
  • RN4-1 to RN4-n belong to layer 4
  • MN3 is used to manage RN3-1 to RN3-n
  • MN4 is used to manage RN4-1 to RN4-n.
  • the process of maintaining memory consistency in a virtualization system is described below by taking RN-0 modifying the entry information of the shared page table and initiating maintaining the consistency of entry information in the TLB of each physical core as an example.
  • the number of layers indicated in the broadcast range information is multiple, and the multiple layers are managed by different MNs.
  • the layers indicated in the broadcast range information include layers 3 and 4, the nodes of layer 3 are managed by MN3, and the nodes of layer 4 are managed by MN4.
  • RN-0 sends the DVM request to MN3. MN3 parses the broadcast range information, and determines that the broadcast range information includes not only RN3-1 to RN3-n in layer 3, which corresponds to MN3, but also RN4-1 to RN4-n in layer 4, which corresponds to MN4.
  • MN3 broadcasts the snpDVM request to RN3-1 to RN3-n in layer 3, and MN3 sends the DVM request to MN4. MN4 parses the broadcast range information, determines that it indicates RN4-1 to RN4-n in layer 4, and broadcasts the snpDVM request to RN4-1 to RN4-n in layer 4.
  • RN3-1 to RN3-n may perform the entry information invalidation operation in the TLB and send Resp information to MN3 in the manner described in the embodiment corresponding to FIG. 2 .
  • RN4-1 to RN4-n may perform the entry information invalidation operation in the TLB and send Resp information to MN4 in the manner described in the embodiment corresponding to FIG. 2 .
  • after MN4 collects the Resp information of RN4-1 to RN4-n, it can send the Resp information to MN3; MN3 collects the Resp information of RN3-1 to RN3-n and MN4, and sends the Resp information to RN-0.
  • FIG. 5 shows the situation where there are two MNs corresponding to the layer indicated by the broadcast range information.
  • the number of MNs may be greater than 2.
  • Each MN can transmit DVM requests to each other to realize the broadcast range.
  • the embodiment of this application does not limit the number of MNs and the communication mode between multiple MNs.
  • the broadcast range of the MN itself may cover a large number of nodes, such as RN-1 to RN-n, but because of the limitation imposed by the broadcast range information, the MN can broadcast the snpDVM request to RN-1 only, instead of also sending snpDVM requests to RN-2 to RN-n. When maintaining page table consistency, the MN therefore does not need to wait for replies from RN-2 to RN-n, which saves waiting time, improves maintenance efficiency, reduces the amount of computation in the system, and saves computing resources.
  • the waiting time of the MN can be reduced, the maintenance efficiency can be improved, the calculation amount of the system can be reduced, and computing resources can be saved.
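The saving in waiting time can be made concrete with a toy calculation. The latencies below are invented for illustration, and the model assumes that the snooped nodes reply in parallel, so the MN waits for the slowest one; neither assumption comes from the application.

```python
# Hypothetical per-node reply latencies in arbitrary time units.
reply_latency = {"RN-1": 3, "RN-2": 9, "RN-3": 4, "RN-4": 12}


def wait_time(targets, latency=reply_latency):
    """Time until the last reply arrives, assuming nodes reply in parallel."""
    return max((latency[node] for node in targets), default=0)


full_broadcast = list(reply_latency)  # MN snoops every node it manages
limited_broadcast = ["RN-1"]          # MN snoops only the indicated node

print(len(full_broadcast), wait_time(full_broadcast))        # 4 12
print(len(limited_broadcast), wait_time(limited_broadcast))  # 1 3
```

Under these assumptions the limited broadcast both sends fewer snoops and removes the slowest uninvolved node from the critical path.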
  • TLBI can also be replaced with instructions such as cache maintenance instruction (instruction cache maintenance instruction, IC), and the IC instruction is used to implement the above functions of TLBI, which will not be repeated here.
  • the reason for triggering memory consistency maintenance can also be that RN-0 modifies the entry information in its own TLB. For example, if RN-0 clears the entry information in its own TLB, RN-0 can also request the relevant physical cores in the virtualization system to perform memory consistency maintenance. The process of memory consistency maintenance is described in detail above and is not repeated here.
  • when the MN broadcasts the snpDVM instruction to the nodes indicated by the broadcast range information, it can send it in a covering manner, in which the physical cores of the nodes other than those indicated by the broadcast range information are covered (masked), so that those physical cores cannot receive the snpDVM instruction.
  • FIG. 6 shows a schematic flowchart of data synchronization in a multi-core system provided by the present application. As shown in FIG. 6 , the method of the embodiment of the present application includes:
  • the instruction may be used to instruct the first management node to perform a certain operation, which may be the smallest functional unit of operation.
  • the instruction can be TLBI, or it can be an instruction such as IC.
  • RN-0 can generate instructions that can be used to instruct the maintenance of memory coherency of the virtualized system.
  • the requesting node packages the instruction and broadcast range information into a DVM request.
  • the broadcast range information is used to indicate the broadcast range of the first management node.
  • the broadcast range information may be used to indicate the nodes that the first management node needs to broadcast, and so on.
  • the broadcast range information may be set in a register, or may be set in other storage devices, which is not specifically limited in this embodiment of the present application.
  • the DVM request is used to request the first management node to maintain the consistency of the TLB in the physical core.
  • the DVM request carries broadcast range information.
  • the RN-0 running on the physical core #1 can package the broadcast range information and the TLBI into a DVM request according to the TLBI.
  • the requesting node sends a DVM request to the first management node.
  • the first management node receives the DVM request from the requesting node.
  • a possible implementation of the requesting node sending the DVM request to the first management node is as follows: the requesting node sends the DVM request to the first management node through the bus, and correspondingly, the first management node receives the DVM request from the requesting node through the bus.
  • the bus protocol recognizes that the DVM request includes the broadcast range information, so the bus can transmit the DVM request including the broadcast range information.
  • when performing page table consistency maintenance, RN-0 sends a DVM request to the MN through the CHI bus, and the MN receives the DVM request from RN-0 through the CHI bus; the DVM request includes broadcast range information.
  • the first management node parses the broadcast range information to obtain M target nodes, where M is a positive integer.
  • the target node refers to the node where the physical core indicated by the broadcast range information is located.
  • the MN can parse the broadcast range information and obtain the M nodes that the broadcast range information indicates need to be broadcast to, as the M target nodes.
  • the first management node sends broadcast information to the M target nodes.
  • the broadcast information is used to instruct the target node to perform memory consistency maintenance, for example, the broadcast information may be the snpDVM request in the above embodiment.
  • the memory consistency maintenance may refer to keeping the data in the target node consistent with the data in the first management node. For example, after RN-0 modifies the shared page table, the target node RN-1, the target node RN-2, and the target node RN-3 need to be updated to the modified shared page table in their respective TLBs. After receiving the broadcast information from the first management node, the M target nodes can perform memory consistency maintenance respectively. For example, any target node can invalidate the entry information currently stored in the TLB, access the updated shared page table from the memory system again, and store the updated shared page table in the TLB.
  • the first management node collects M pieces of response information from the M target nodes.
  • the response information is used to indicate that the target node completes data synchronization.
  • RN-1 sends response information to the MN, and correspondingly, the MN collects the response information from RN-1.
  • the first management node may collect the response information from the target node RN-2 until the MN has collected the response information from the M target nodes.
  • the time at which each target node performs data synchronization may differ, so the time at which each target node sends response information to the first management node may also differ. During this process, the first management node needs to wait until the response information of all M nodes has been collected; only then can it confirm that memory consistency maintenance is complete.
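Waiting for M responses that arrive at different times amounts to tracking a set of outstanding nodes. The sketch below shows one way to do that; `ResponseCollector` and `on_response` are hypothetical names, and the arrival order is made up for the example.

```python
class ResponseCollector:
    """Tracks which of the M target nodes have not yet replied."""

    def __init__(self, targets):
        self.outstanding = set(targets)

    def on_response(self, node):
        """Record one reply; return True once all M replies are in."""
        self.outstanding.discard(node)
        return not self.outstanding


collector = ResponseCollector(["RN-1", "RN-2", "RN-3"])

# Replies may arrive in any order because each node synchronizes at its own pace.
for node in ["RN-3", "RN-1", "RN-2"]:
    done = collector.on_response(node)
    print(node, "done" if done else "waiting")
# RN-3 waiting
# RN-1 waiting
# RN-2 done
```

Only when the last outstanding node has replied would the management node report completion to the requesting node.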
  • the first management node may also forward the DVM request to the second management node.
  • the second management node may be the MN2 node or the MN4 node; the second management node may then receive the DVM request from the first management node and, according to the DVM request, collect the data of the target nodes corresponding to the second management node.
  • the first management node sends information for indicating completion of data synchronization to the requesting node.
  • the requesting node receives information from the first management node indicating that the maintenance of memory consistency is completed.
  • the information used to indicate the completion of memory consistency maintenance may be response information sent by the first management node to the requesting node to indicate the completion of memory consistency maintenance after collecting response information from the M target nodes.
  • the information used to indicate that the memory consistency maintenance is completed may be in the form of numbers or characters, which is not specifically limited in this embodiment of the present application.
  • the requesting node can receive the information indicating that the memory consistency maintenance is completed, and confirm that the memory consistency maintenance is completed.
  • when the requesting node sends a request to the first management node, the request includes the broadcast range information, which narrows the broadcast range of the first management node. Broadcast information is subsequently sent only to the nodes corresponding to the broadcast range information, which reduces the time spent waiting for the invalidation operations of those nodes and improves the efficiency of maintaining memory consistency; the nodes corresponding to the broadcast range information do not need to interrupt their tasks, so their performance is not affected.
  • the broadcast range information may include: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to a virtual machine running in the requesting node, wherein a layer corresponds to multiple physical cores.
  • the number of virtual machines running in the requesting node may be one or multiple.
  • the number of VM-related physical cores may be one or multiple.
  • the identifier of the physical core is used to clearly identify the physical core.
  • the identifier of the physical core may be the serial number, address, or name of the physical core, which is not specifically limited in this embodiment of the present application.
  • the number of VM-related layers may be one or multiple.
  • the identifier of the layer is used to clearly identify the layer.
  • the identifier of the layer may be the serial number or name of the layer, which is not specifically limited in this embodiment of the present application.
  • the broadcast range information may include the identifier of the physical core related to the virtual machine running in the requesting node, or the identifier of the layer related to the virtual machine running in the requesting node, or both the identifier of the physical core and the identifier of the layer related to the virtual machine running in the requesting node. The first management node may subsequently parse the broadcast range information, and send the broadcast information to the physical core or layer indicated by the broadcast range information.
  • the broadcast range information may be set in a register of the requesting node.
  • the register may be a specially designed broadcast range register for storing broadcast range information, or may be any register in the requesting node, which is not specifically limited in this embodiment of the present application.
  • FIG. 7 shows a schematic diagram of a logical architecture of an MN parsing a DVM request.
  • a register 701 can be set in the MN 700.
  • after the MN receives the DVM request carrying the broadcast range information from the requesting node (sco physical core), it can compare the broadcast range information delivered by the bus with the nodes managed by the MN as indicated in the register 701. If the broadcast range information matches a node managed by the MN, the broadcast information is forwarded to the matching node.
  • the broadcast information may be, for example, a snpDVM request, and the snpDVM request includes a DVM code corresponding to the broadcast range information.
  • bitmaps of physical cores may be set in the MN, and each bitmap indicates a corresponding RN.
  • after receiving the DVM request carrying the broadcast range information from the requesting node, the MN forwards the broadcast information to the RNs specified by the bitmap information; for example, the broadcast information may be the snpDVM request.
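The comparison between the incoming broadcast range and the MN's own register can be illustrated with bitmaps. The sketch assumes that both values are bitmaps in which bit i stands for node i; the function names and the example values are hypothetical.

```python
def matching_nodes(broadcast_range, managed_register):
    """Compare the incoming broadcast range with the MN's own register.

    Both arguments are bitmaps; assumption: bit i stands for node i.
    """
    return broadcast_range & managed_register


def bits_set(bitmap):
    """List the node indices whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


managed_register = 0b0000_1111  # this MN manages nodes 0 to 3
broadcast_range = 0b0011_0110   # request names nodes 1, 2, 4 and 5

match = matching_nodes(broadcast_range, managed_register)
print(bits_set(match))                                # [1, 2]
print(bits_set(broadcast_range & ~managed_register))  # [4, 5]
```

The first result is the set of local nodes that receive the broadcast information; the second is the remainder that this MN does not manage and that another MN would have to serve.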
  • S605 includes: the first management node sends broadcast information to the M target nodes in a covering manner, wherein the covering manner is a manner in which the nodes other than the M target nodes are covered.
  • the covering mode may refer to covering the unused physical cores when the first management node sends the broadcast request.
  • among node 1 to node m, the MN may block the physical cores of all nodes except node 1, node 5 and node 6, and then send the broadcast information. Because the physical cores of the other nodes are blocked, those nodes do not receive the broadcast information, and the MN does not need to wait for their replies about maintaining page table consistency, so the waiting time of the MN is saved and the amount of computation in the system is reduced.
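The covering manner can be modelled as a per-node flag that blocks delivery. The following sketch assumes m = 8 nodes and reuses the node 1, node 5 and node 6 example; `broadcast_with_cover` is a hypothetical helper, and how covering is realized in hardware is not specified here.

```python
def broadcast_with_cover(all_nodes, targets):
    """Send to every node that is not covered; return (delivered, blocked).

    Assumption: covering is modelled as a per-node flag checked before delivery.
    """
    covered = {node: node not in targets for node in all_nodes}
    delivered = [node for node in all_nodes if not covered[node]]
    blocked = [node for node in all_nodes if covered[node]]
    return delivered, blocked


m = 8
all_nodes = [f"node {i}" for i in range(1, m + 1)]
delivered, blocked = broadcast_with_cover(all_nodes, {"node 1", "node 5", "node 6"})

print(delivered)     # ['node 1', 'node 5', 'node 6']
print(len(blocked))  # 5
```

The MN then expects replies only from the delivered nodes, so the five covered nodes add nothing to its waiting time.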
  • FIG. 8 is a schematic flowchart of a specific data synchronization method in a multi-core system provided by the present application. As shown in FIG. 8 , the method of the embodiment of the present application includes:
  • software refers to a series of sets of data and instructions organized in a specific order.
  • the software may be CPU scheduler software.
  • the user can schedule the physical core #N that runs the VM through the CPU scheduler software, and write the physical cores related to the running of the VM (that is, the broadcast range information) into the broadcast range register of physical core #N through a custom instruction set architecture (ISA) instruction, so that the broadcast range register includes the broadcast range information.
  • the requesting node generates a DVM request, and sends the DVM request through the bus.
  • the requesting node modifies the shared page table, which can also be understood as the VM software updating the translation.
  • the requesting node may generate a TLBI
  • the requesting node packages the TLBI and the broadcast range information into a DVM request
  • the broadcast range information may also be understood as a custom broadcast range domain.
  • the MN parses the broadcast range information in the DVM request, and the MN sends the broadcast information to the RN indicated by the broadcast range information in a targeted manner by covering.
  • the MN collects Resp information in a targeted manner.
  • the MN collects, in a targeted manner, the Resp information of the RNs indicated by the broadcast range information, and after collecting it, sends the Resp information to the requesting node.
  • the CPU scheduler software maintains the broadcast range register and can update the broadcast range in a user-defined way. The MN subsequently sends broadcast information to the corresponding nodes by covering, which can reduce the time the MN waits for the invalidation operations of the corresponding nodes, reduce software maintenance costs and the difficulty of MN broadcasting, and improve the efficiency of maintaining consistency.
  • the data synchronization method of the embodiment of the present application can also be easily ported to other physical cores of the ARM architecture. It does not increase the design difficulty of the MN, and it transfers the function of maintaining the range to software, which can reduce the hardware burden of the MN.
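The division of labour in the FIG. 8 flow, where software maintains the broadcast range register and the MN simply follows it, can be sketched end to end. All class and function names below are hypothetical, the register is modelled as a Python set, and excluding the requesting core from its own broadcast range is an assumption made for the example.

```python
class PhysicalCore:
    def __init__(self, core_id):
        self.core_id = core_id
        self.broadcast_range = set()  # contents of the broadcast range register

    def build_dvm_request(self, tlbi):
        """Package the TLBI together with the current broadcast range."""
        return {"tlbi": tlbi, "range": set(self.broadcast_range)}


class Scheduler:
    """Stand-in for the CPU scheduler software that places vCPUs on cores."""

    def __init__(self, cores):
        self.cores = cores
        self.placement = {}  # VM name -> set of core ids running that VM

    def schedule(self, vm, core_id):
        self.placement.setdefault(vm, set()).add(core_id)
        # Refresh the register of every core that runs this VM.
        for cid in self.placement[vm]:
            self.cores[cid].broadcast_range = self.placement[vm] - {cid}


def mn_handle(request):
    """MN side: snoop only the indicated cores and gather their Resp info."""
    return sorted(f"Resp(core {c})" for c in request["range"])


cores = {i: PhysicalCore(i) for i in range(8)}
sched = Scheduler(cores)
for core_id in (0, 1, 2):
    sched.schedule("VM1", core_id)

request = cores[0].build_dvm_request(tlbi="TLBI page 0x10")
print(sorted(request["range"]))  # [1, 2]
print(mn_handle(request))        # ['Resp(core 1)', 'Resp(core 2)']
```

Because the scheduler already knows where each VM runs, the MN needs no placement logic of its own, which matches the stated goal of moving range maintenance into software.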
  • Embodiments of the present application also provide a computer-readable storage medium.
  • the methods described in the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
  • Computer-readable media can include both computer storage media and communication media and also include any medium that can transfer a computer program from one place to another.
  • the storage medium can be any target medium that can be accessed by a computer.
  • the computer readable medium may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store the required program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave
  • coaxial cable, fiber optic cable, twisted pair, DSL or wireless technologies such as infrared, radio and microwave
  • Disk and disc as used herein includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • the embodiments of the present application also provide a computer program product.
  • the methods described in the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. If implemented in software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the above-mentioned computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the above-mentioned method embodiments are generated.
  • the aforementioned computer may be a general purpose computer, a special purpose computer, a computer network, a base station, a terminal, or other programmable devices.

Abstract

Embodiments of the present application relate to the technical field of virtualization, and provide a virtualization system and a method for maintaining memory consistency in a virtualization system. The system comprises a request node and a first management node; the request node is used for sending a distributed virtual memory (DVM) request to the first management node when maintaining memory consistency of the virtualization system, the DVM request comprising broadcast range information; the first management node is used for parsing the DVM request to obtain the broadcast range information, the broadcast range information indicating information of M target nodes, and M being a positive integer; and the first management node is also used for sending broadcast information to each of the M target nodes, the broadcast information being used for instructing each target node to maintain the memory consistency. In this way, when the memory consistency is maintained, the broadcast range of an MN may be limited, the time that the MN waits for an invalid operation of a physical core is reduced, the physical core does not need to interrupt a task, and the performance of the physical core is not affected.

Description

虚拟化系统以及虚拟化系统中内存一致性维护方法Virtualized system and memory consistency maintenance method in virtualized system 技术领域technical field
本申请涉及虚拟化技术领域,尤其涉及一种虚拟化系统以及虚拟化系统中内存一致性维护方法。The present application relates to the field of virtualization technology, and in particular, to a virtualization system and a method for maintaining memory consistency in the virtualization system.
背景技术Background technique
With the rapid development of processors based on the Advanced RISC Machine (ARM) architecture and of the related software ecosystem, the ARM architecture is widely used in fields such as embedded systems, consumer electronics, big data, and cloud computing. The ARM architecture supports multi-core systems, that is, the processor of an electronic device can contain multiple physical cores.
For example, FIG. 1 is a schematic diagram of a scenario in which an ARM-based processor is used in a virtual machine (VM) application. As shown in FIG. 1, the scenario may include four levels of domains: domain 0 includes one physical core, domain 1 includes four physical cores, domain 2 includes two domain 1s, and domain 3 includes domain 2 and other domains (not shown in the figure). Domain 1 to domain 4 may each correspond to its own shareable domain, and each shareable domain may be used to store the information that the domain needs for scheduling and running. The specific content of this information depends on the actual application scenario and is not limited here.
A virtual machine may correspond to multiple virtual central processing units (vCPUs). For example, in FIG. 1, virtual machine #1 may correspond to vCPU1 to vCPU4, and virtual machine #2 may correspond to vCPU5 to vCPU8. The virtual machine kernel CPU scheduler may schedule one vCPU to run on multiple physical cores, or multiple vCPUs to run on one physical core. FIG. 1 shows only the example in which one vCPU runs on one physical core, which does not limit the physical core on which a vCPU actually runs. It can be understood that, in addition to virtual machine #1 and virtual machine #2, the scenario of FIG. 1 may include other virtual machines, such as VMx, whose architecture is similar to that of virtual machine #1 or virtual machine #2 and is not described again.
As shown in FIG. 1, the scenario also includes a memory system and a miscellaneous node (MN). The memory system stores, among other things, the memory page table, also simply called the page table. The page table stores mappings from virtual addresses (VA) to physical addresses (PA), where a VA is unique within the accessing process and a PA is unique in the hardware. In the virtual machine application scenario, each physical core first obtains from the memory system the page table entries relevant to it and stores those mappings in its own translation lookaside buffer (TLB). Later, when a physical core needs a VA-to-PA mapping, it first searches its own TLB; if the mapping is not found there, it searches the memory system. When a physical core modifies one or more mapping entries in the page table, the TLBs of the other physical cores in the scenario must be synchronized with the modified entries to maintain memory consistency.
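The lookup and invalidation behaviour described above can be sketched as follows. This is a minimal Python model for illustration, not the hardware mechanism: `PAGE_TABLE`, `Core`, `translate`, and `invalidate` are hypothetical names, and the addresses are arbitrary.

```python
# Hypothetical model: the memory system holds the page table (VA -> PA).
PAGE_TABLE = {0x1000: 0xA000, 0x2000: 0xB000}


class Core:
    """A physical core with its own TLB (a private cache of VA -> PA mappings)."""

    def __init__(self):
        self.tlb = {}
        self.walks = 0  # how many times the core had to go to the memory system

    def translate(self, va):
        if va not in self.tlb:       # TLB miss: fetch the mapping from memory
            self.walks += 1
            self.tlb[va] = PAGE_TABLE[va]
        return self.tlb[va]

    def invalidate(self, va):
        self.tlb.pop(va, None)       # drop a possibly stale entry


core_b = Core()
core_b.translate(0x1000)             # core B caches the original mapping
PAGE_TABLE[0x1000] = 0xC000          # another core modifies the page table entry
stale = core_b.translate(0x1000)     # core B still sees the old PA
core_b.invalidate(0x1000)            # consistency maintenance on core B
fresh = core_b.translate(0x1000)     # refetched from the memory system
```

The stale read between the modification and the invalidation is the inconsistency that the maintenance flow is meant to close.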
The MN manages multiple physical cores. For example, the MN may receive a distributed virtual memory (DVM) request from a physical core; a DVM request may be used, among other things, to maintain memory consistency in the virtual machine application scenario. The MN may send snoop (snp) DVM requests to the other physical cores; a snpDVM request instructs those cores to maintain the consistency of their respective TLBs. The MN receives response (Resp) information from the other physical cores and returns Resp information to the physical core that initiated the DVM request. Here, maintaining TLB consistency means that each physical core invalidates the modified entries in its TLB; if the core later needs the content of an invalidated entry, it obtains it from the memory system.
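As a rough sketch of this request, snoop, and response flow, the following Python model may help. It assumes an unrestricted broadcast to every core other than the requester; the class and method names (`Core`, `ManagementNode`, `on_snp_dvm`, `handle_dvm`) are hypothetical and do not correspond to any real interconnect API.

```python
class Core:
    def __init__(self, core_id):
        self.core_id = core_id
        self.tlb = {0x1000: 0xA000}  # every core starts with the same cached entry

    def on_snp_dvm(self, va):
        """Handle a snpDVM request: invalidate the entry and reply with Resp."""
        self.tlb.pop(va, None)
        return ("Resp", self.core_id)


class ManagementNode:
    def __init__(self, cores):
        self.cores = {core.core_id: core for core in cores}

    def handle_dvm(self, requester_id, va):
        """Snoop every other core, collect the responses, then answer the requester."""
        responses = [
            core.on_snp_dvm(va)
            for core_id, core in self.cores.items()
            if core_id != requester_id
        ]
        return ("Resp", len(responses))


cores = [Core(i) for i in range(4)]
mn = ManagementNode(cores)
reply = mn.handle_dvm(requester_id=0, va=0x1000)
```

In this model the MN has to wait for three responses before it can answer core 0, which is the waiting time that a restricted broadcast range aims to reduce.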
It should be noted that the scenario of FIG. 1 may be applied in a single electronic device, in which case the physical cores, the memory system, and the MN shown in FIG. 1 may all be included in that device. The scenario may also be applied in a distributed system that includes multiple electronic devices, in which case the physical cores, the memory system, and the MN may be distributed across different electronic devices. The MN may also be called a management node. The MN may be one physical core that implements the management function, or multiple physical cores that implement it; the embodiments of the present application do not limit the specific hardware implementation of the MN.
In a possible implementation, the physical cores may maintain TLB consistency by means of inter-processor interrupts (IPI). Specifically, when a physical core modifies entry information in the memory page table in shared memory, that core must invalidate the corresponding entry in its own TLB and must also send an IPI instruction to the other physical cores via the MN. The IPI instruction includes information indicating that TLB consistency is to be maintained, and the other physical cores invalidate the corresponding entries in their own TLBs in the IPI interrupt handler, thereby maintaining the consistency of the shared memory page table.
However, this implementation relies on IPIs. When a physical core receives an IPI request, it suspends the task it is executing, services the interrupt, completes the page table synchronization, and only then returns to the original program. This process degrades the performance of the physical core.
Summary of the Invention
Embodiments of the present application provide a virtualization system and a method for maintaining memory consistency in a virtualization system. When memory consistency is maintained, the broadcast range of the MN can be limited, which reduces the time the MN waits for invalidation operations of physical cores; in addition, the physical cores do not need to interrupt their tasks, so their performance is not affected.
In a first aspect, an embodiment of the present application provides a virtualization system including a request node and a first management node. The request node is configured to send a distributed virtual memory (DVM) request to the first management node when performing memory consistency maintenance of the virtualization system, where the DVM request includes broadcast range information. The first management node is configured to parse the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer. The first management node is further configured to send broadcast information to each of the M target nodes, where the broadcast information instructs each target node to perform memory consistency maintenance.
On this basis, when memory consistency is maintained in the virtualization system, the broadcast range of the first management node can be limited, which reduces the time the first management node waits for invalidation operations of physical cores; in addition, the physical cores do not need to interrupt their tasks, so their performance is not affected.
In a possible implementation, the broadcast range information includes an identifier of a physical core related to the virtual machine running on the request node, and/or an identifier of a layer related to the virtual machine running on the request node, where a layer corresponds to multiple physical cores. In this way, the broadcast range information can indicate the broadcast range at the granularity of physical cores, which limits the range precisely, or at the granularity of layers; because one layer identifier can correspond to multiple physical cores, multiple physical cores can be indicated with fewer identifiers, which saves system resources.
In a possible implementation, the first management node is specifically configured to send the broadcast information to each of the M target nodes when parsing shows that the broadcast range information indicates the identifiers of M physical cores; or the first management node is specifically configured to send the broadcast information to each of the M target nodes in a layer when parsing shows that the broadcast range information indicates the identifier of that layer. In this way, the broadcast range of the first management node can be limited, which reduces the time the first management node waits for invalidation operations of physical cores.
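The two granularities can be illustrated with a small Python sketch of how a management node might resolve broadcast range information into target nodes. The dictionary layout of the broadcast range, the `LAYER_TO_CORES` topology table, and the function name `resolve_targets` are all assumptions made for this example; the application does not specify an encoding.

```python
# Hypothetical topology: each layer identifier stands for several physical cores.
LAYER_TO_CORES = {
    "layer0": [0, 1, 2, 3],
    "layer1": [4, 5, 6, 7],
}


def resolve_targets(broadcast_range):
    """Return the sorted list of target core identifiers for a broadcast range.

    The range may name cores directly ("core_ids"), name layers ("layer_ids"),
    or both.
    """
    targets = set(broadcast_range.get("core_ids", []))
    for layer_id in broadcast_range.get("layer_ids", []):
        targets.update(LAYER_TO_CORES[layer_id])
    return sorted(targets)


by_core = resolve_targets({"core_ids": [1, 5]})
by_layer = resolve_targets({"layer_ids": ["layer1"]})
mixed = resolve_targets({"core_ids": [1], "layer_ids": ["layer1"]})
```

A single layer identifier here expands to four cores, which is the sense in which layer granularity needs fewer identifiers than per-core granularity.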
In a possible implementation, the nodes indicated by the broadcast range information include nodes managed by a second management node. The first management node is further configured to send the DVM request to the second management node, and the second management node is configured to parse the DVM request to obtain the broadcast range information and to send the broadcast information to the nodes that it manages. In this way, the broadcast range remains precisely limited when the second management node sends the broadcast information to its own nodes.
In a possible implementation, the layers indicated by the broadcast range information include a layer managed by a second management node. The first management node is further configured to send the DVM request to the second management node, and the second management node is configured to parse the DVM request to obtain the broadcast range information and to send the broadcast information to the nodes in the layer that it manages. In this way, when the second management node sends the broadcast information to the nodes in its layer, multiple nodes can be indicated with fewer identifiers, which saves system resources.
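A minimal sketch of this forwarding between management nodes is given below, using per-core identifiers for simplicity. It assumes that each management node knows which cores it manages and has at most one peer; the names `ManagementNode`, `handle_dvm`, and the `forwarded` flag are hypothetical and only serve to keep the example from forwarding a request back and forth.

```python
class ManagementNode:
    def __init__(self, name, local_cores, peer=None):
        self.name = name
        self.local_cores = set(local_cores)
        self.peer = peer  # the second management node, if any

    def handle_dvm(self, request, forwarded=False):
        """Broadcast to local targets; forward the request if some targets are remote.

        Returns a log of (management node name, core id) pairs, one per broadcast.
        """
        targets = set(request["broadcast_range"])
        log = [(self.name, core) for core in sorted(targets & self.local_cores)]
        remote = targets - self.local_cores
        if remote and self.peer is not None and not forwarded:
            # The peer parses the same DVM request and serves its own cores.
            log += self.peer.handle_dvm(request, forwarded=True)
        return log


mn2 = ManagementNode("MN2", [4, 5, 6, 7])
mn1 = ManagementNode("MN1", [0, 1, 2, 3], peer=mn2)
log = mn1.handle_dvm({"broadcast_range": [1, 5, 6]})
```

Cores outside the broadcast range, on either management node, receive nothing in this model.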
In a possible implementation, a register is provided in the request node, and the register stores the broadcast range information. The request node is specifically configured to generate an instruction when performing memory consistency maintenance of the virtualization system, package the instruction and the broadcast range information into a DVM request, and send the DVM request to the first management node. Because the broadcast range information is held in a register in the request node, the broadcast range can be updated as needed, and the request node can conveniently package the instruction together with the broadcast range information into a DVM request for the first management node.
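The packaging step can be sketched in Python as follows. The field names of `DvmRequest` and the methods of `RequestNode` are hypothetical; the application does not define a request format, so this only illustrates the idea of combining an instruction with the current register content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DvmRequest:
    instruction: str         # e.g. "TLBI"
    broadcast_range: tuple   # snapshot of the broadcast range register


class RequestNode:
    def __init__(self):
        self.broadcast_range_register = ()

    def write_range_register(self, core_ids):
        """Update the broadcast range; later requests pick up the new value."""
        self.broadcast_range_register = tuple(core_ids)

    def build_dvm_request(self, instruction):
        """Package the instruction with the current register content."""
        return DvmRequest(instruction, self.broadcast_range_register)


rn = RequestNode()
rn.write_range_register([1, 2])
first = rn.build_dvm_request("TLBI")
rn.write_range_register([5])
second = rn.build_dvm_request("TLBI")
```

Each request carries the range that was in the register when it was built, so changing the register affects only subsequent requests.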
In a possible implementation, the instruction includes a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC). In this way, the TLBI instruction can maintain the consistency of the TLB in a physical core, and the IC instruction can also maintain the consistency of the TLB in a physical core.
In a possible implementation, the first management node is further configured to collect M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance. The first management node is further configured to send, to the request node, information indicating that memory consistency maintenance is complete. In this way, the first management node collects the responses and reports completion.
In a possible implementation, the request node is further configured to generate a data synchronization barrier (DSB) instruction when it issues multiple DVM requests within a preset time. The DSB instruction instructs the first management node to, after it has collected the response information of the nodes corresponding to the multiple DVM requests, synchronously send to the first management node the information indicating that memory consistency maintenance is complete. In this way, the request node can generate a DSB instruction to indicate that memory consistency maintenance is complete for the nodes corresponding to the multiple DVM requests.
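The barrier behaviour can be modelled roughly as a tracker that reports completion only once every outstanding DVM request has received all of its responses. This is a sketch under the assumption that responses can be attributed to a request identifier; `CompletionTracker` and its methods are hypothetical names.

```python
class CompletionTracker:
    """Tracks outstanding DVM requests on the management node side."""

    def __init__(self):
        self.pending = {}  # request id -> set of cores that have not responded yet

    def issue(self, req_id, targets):
        self.pending[req_id] = set(targets)

    def on_response(self, req_id, core_id):
        self.pending[req_id].discard(core_id)

    def dsb_complete(self):
        """True only when no issued request is still waiting for a response."""
        return all(not waiting for waiting in self.pending.values())


tracker = CompletionTracker()
tracker.issue("dvm-1", [1, 2])
tracker.issue("dvm-2", [5])
tracker.on_response("dvm-1", 1)
tracker.on_response("dvm-2", 5)
before = tracker.dsb_complete()   # core 2 has not answered dvm-1 yet
tracker.on_response("dvm-1", 2)
after = tracker.dsb_complete()
```

A single completion indication for the whole batch avoids reporting each request separately.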
In a possible implementation, the first management node is specifically configured to send the broadcast information to each of the M target nodes in a masking manner, where the masking manner masks out the nodes other than the M target nodes. In this way, the physical cores of the nodes other than the target nodes are masked and do not receive the broadcast information, which shortens the waiting time of the first management node and reduces the amount of computation in the system.
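One plausible way to picture the masking manner is a bitmask with one bit per physical core, where only the bits of the target nodes are set. The following Python sketch assumes such an encoding, which the application does not prescribe; `broadcast_mask` and `receivers` are hypothetical helper names.

```python
def broadcast_mask(target_cores, num_cores):
    """Build a bitmask in which bit i is set only if core i is a target."""
    mask = 0
    for core in target_cores:
        if not 0 <= core < num_cores:
            raise ValueError(f"core {core} is outside 0..{num_cores - 1}")
        mask |= 1 << core
    return mask


def receivers(mask, num_cores):
    """List the cores that are not masked out and therefore receive the broadcast."""
    return [core for core in range(num_cores) if (mask >> core) & 1]


mask = broadcast_mask([1, 5], num_cores=8)
delivered_to = receivers(mask, num_cores=8)
```

All cores whose bit is clear are skipped, so the management node only has to wait for responses from the cores listed in `delivered_to`.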
In a second aspect, an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: a request node sends a distributed virtual memory (DVM) request to a first management node when performing memory consistency maintenance of the virtualization system, where the DVM request includes broadcast range information; the first management node parses the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer; and the first management node sends broadcast information to each of the M target nodes, where the broadcast information instructs each target node to perform memory consistency maintenance.
In a possible implementation, the broadcast range information includes an identifier of a physical core related to the virtual machine running on the request node, and/or an identifier of a layer related to the virtual machine running on the request node, where a layer corresponds to multiple physical cores.
In a possible implementation, the first management node sending broadcast information to each of the M target nodes includes: the first management node sends the broadcast information to each of the M target nodes when parsing shows that the broadcast range information indicates the identifiers of M physical cores; or the first management node sends the broadcast information to each of the M target nodes in a layer when parsing shows that the broadcast range information indicates the identifier of that layer.
In a possible implementation, the nodes indicated by the broadcast range information include nodes managed by a second management node and/or a layer managed by the second management node, and the method further includes: the first management node sends the DVM request to the second management node.
In a possible implementation, the request node sending the DVM request to the first management node when performing memory consistency maintenance of the virtualization system includes: the request node generates an instruction when performing memory consistency maintenance of the virtualization system; the request node packages the instruction and the broadcast range information into the DVM request; and the request node sends the DVM request to the first management node.
In a possible implementation, the instruction includes a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC).
In a possible implementation, the first management node collects M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance; and the first management node sends, to the request node, information indicating that memory consistency maintenance is complete.
In a possible implementation, when the request node issues multiple DVM requests within a preset time, it generates a data synchronization barrier (DSB) instruction. The DSB instruction instructs the first management node to, after it has collected the response information of the nodes corresponding to the multiple DVM requests, synchronously send to the first management node the information indicating that memory consistency maintenance is complete.
In a possible implementation, the first management node sending broadcast information to each of the M target nodes includes: the first management node sends the broadcast information to each of the M target nodes in a masking manner, where the masking manner masks out the nodes other than the M target nodes.
In a third aspect, an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: a first management node receives a distributed virtual memory (DVM) request from a request node, where the DVM request includes broadcast range information; the first management node parses the DVM request to obtain the broadcast range information, where the broadcast range information indicates information of M target nodes and M is a positive integer; and the first management node sends broadcast information to each of the M target nodes, where the broadcast information instructs each target node to perform memory consistency maintenance.
In a possible implementation, the broadcast range information includes an identifier of a physical core related to the virtual machine running on the request node, and/or an identifier of a layer related to the virtual machine running on the request node, where a layer corresponds to multiple physical cores.
In a possible implementation, the first management node sending broadcast information to each of the M target nodes includes: the first management node sends the broadcast information to each of the M target nodes when parsing shows that the broadcast range information indicates the identifiers of M physical cores; or the first management node sends the broadcast information to each of the M target nodes in a layer when parsing shows that the broadcast range information indicates the identifier of that layer.
In a possible implementation, the nodes indicated by the broadcast range information include nodes managed by a second management node and/or a layer managed by the second management node, and the method further includes: the first management node sends the DVM request to the second management node.
In a possible implementation, the first management node collects M pieces of response information from the M target nodes, where the response information of each target node indicates that the target node has completed memory consistency maintenance; and the first management node sends, to the request node, information indicating that memory consistency maintenance is complete.
In a possible implementation, the first management node sending broadcast information to each of the M target nodes includes: the first management node sends the broadcast information to each of the M target nodes in a masking manner, where the masking manner masks out the nodes other than the M target nodes.
In a fourth aspect, an embodiment of the present application provides a method for maintaining memory consistency in a virtualization system, including: a request node generates an instruction when performing memory consistency maintenance of the virtualization system; the request node packages the instruction and the broadcast range information set in the request node into a DVM request; and the request node sends the DVM request to a first management node.
In a possible implementation, the broadcast range information includes an identifier of a physical core related to the virtual machine running on the request node, and/or an identifier of a layer related to the virtual machine running on the request node, where a layer corresponds to multiple physical cores.
In a possible implementation, the instruction includes a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC).
In a possible implementation, the request node receives, from the first management node, information indicating that memory consistency maintenance is complete.
In a possible implementation, when the request node issues multiple DVM requests within a preset time, it generates a data synchronization barrier (DSB) instruction. The DSB instruction instructs the first management node to, after it has collected the response information of the nodes corresponding to the multiple DVM requests, synchronously send to the first management node the information indicating that memory consistency maintenance is complete.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium that includes a computer program. When the computer program runs on an electronic device, the electronic device executes the technical solution of any possible design of the third aspect or the fourth aspect.
In a sixth aspect, an embodiment of the present application provides a computer program product that includes instructions. When the instructions run on a computer, the computer executes the technical solution of any possible design of the third aspect or the fourth aspect.
For the beneficial effects of the second to sixth aspects, refer to the beneficial effects of the first aspect; they are not repeated here.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a scenario in which an ARM-based processor is used in a virtual machine application, according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a first system architecture to which the method of the embodiments of the present application is applied;
FIG. 3 is a schematic diagram of a second system architecture to which the method of the embodiments of the present application is applied;
FIG. 4 is a schematic diagram of a third system architecture to which the method of the embodiments of the present application is applied;
FIG. 5 is a schematic diagram of a fourth system architecture to which the method of the embodiments of the present application is applied;
FIG. 6 is a schematic flowchart of data synchronization in a multi-core system according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the logical architecture with which an MN parses a DVM request, according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of a specific data synchronization method in a multi-core system according to an embodiment of the present application.
Detailed Description
To describe the technical solutions of the embodiments of the present application clearly, words such as "first" and "second" are used in the embodiments to distinguish between identical or similar items whose functions and effects are basically the same. For example, a first event and a second event are named only to distinguish different events, and their order is not limited. Those skilled in the art can understand that words such as "first" and "second" do not limit quantity or execution order, and do not require that the items be different.
It should be noted that, in the present application, words such as "exemplary" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the present application should not be construed as preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present the related concepts in a concrete manner.
In the present application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression refers to any combination of those items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
To facilitate understanding of the embodiments of the present application, some terms used in the embodiments are briefly described below.
1. Physical core: may be manufactured from monocrystalline silicon using a given production process, and is used to perform steps such as computing, receiving or storing commands, and processing data. Each physical core has its own independent TLB.
2. TLB maintenance instruction (TLBI): an instruction that directs maintenance of the consistency of the TLB in a physical core.
3. Virtual machine: a complete computer system that is simulated by software, has the functions of a complete hardware system, and runs in an isolated environment.
4. Bus: the common communication trunk over which data is transferred between nodes; it can be used to transfer messages, requests, and the like.
5. Broadcast range register: a register used to store broadcast range information.
6. Request node (RN): one physical core may correspond to one RN; alternatively, an RN can be understood as a node that runs a vCPU on a physical core to implement some function. It should be noted that RN and MN are relative concepts: one MN can manage multiple RNs, and those RNs may correspond to one virtual machine or to multiple virtual machines.
The methods of the embodiments of this application may be applied to virtual machine scenarios in fields such as embedded systems, consumer electronics, big data, automotive electronics, mass storage, imaging devices, industrial control, security systems, and cloud computing. In such a scenario, the electronic device that performs the method includes a processor. The processor may use the ARM architecture. ARM-based processors offer advantages such as high speed, low power consumption, and low cost.
The electronic device may also be referred to as a terminal device, a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), or the like. The electronic device may be a mobile phone, a smart TV, a wearable device, a tablet computer (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and so on. The embodiments of this application do not limit the specific technology or device form used by the electronic device.
With reference to FIG. 1, in another possible implementation of maintaining memory consistency, TLB invalidation operations may be broadcast. Specifically, when a physical core modifies one or more mapping entries of the memory page table in shared memory, the physical core generates a TLBI. While completing the TLB invalidation on itself, the physical core also packs the TLBI into a DVM request and sends the DVM request to the MN through the coherent hub interface (CHI). After the MN receives the DVM request, ARM software by default enables a force broadcast (FB) control instruction, which forces the MN to broadcast the DVM request to all physical cores in the inner shareable (IS) range. The MN broadcasts a snpDVM request to all physical cores in the IS range, and each of those cores invalidates the corresponding entry information in its own TLB according to the snpDVM request. After the MN has collected the Resp information of all physical cores in the IS range, it indicates to the physical core that sent the DVM request that TLB maintenance is complete, thereby keeping the TLBs of the physical cores consistent. It should be noted that, in a typical implementation, the IS range is defined when the system is designed; it is large and cannot be changed.
To ensure that the snpDVM request takes effect synchronously on all physical cores in the IS range, the MN may perform a DVM synchronization (Sync) operation, so that all physical cores in the IS range return Resp information to the MN at the same time, after they have invalidated the corresponding entry information in their TLBs.
However, in the foregoing implementation the MN must wait indiscriminately for every physical core in the IS range to finish invalidating the corresponding entry information in its TLB, which results in a long waiting time and a large performance overhead.
In view of this, the embodiments of this application provide a virtualization system and a method for maintaining memory consistency in a virtualization system. When memory consistency is maintained, the broadcast range of the MN can be limited, which reduces the time the MN spends waiting for physical cores to complete invalidation operations. The physical cores do not need to interrupt their tasks, so their performance is not affected. Specifically, when the RN in the embodiments sends a request to the MN, the request includes broadcast range information, which narrows the broadcast range of the MN. After the MN subsequently sends broadcast information to the nodes corresponding to the broadcast range information, the time spent waiting for those nodes to synchronize is correspondingly shortened, which improves the efficiency of consistency maintenance.
A broadcast range register may be provided in a physical core of the embodiments of this application, with broadcast range information preset in it. The broadcast range register can accept user settings. For example, a user may set or modify the broadcast range information in the register in advance by using CPU scheduler software or the like.
In the embodiments of this application, the MN is configured with logic capable of parsing the broadcast range information in a DVM request. In a possible implementation, an Affinity-like register may be provided in the MN. This register may be set to indicate the RNs corresponding to the MN and the system level (that is, the layer) at which each RN is located. The Affinity register is statically configurable; for example, it may be configured when software deploys a VM.
It should be noted that the broadcast range information of the embodiments may indicate the broadcast range at the granularity of physical cores. For example, the broadcast range information may include the identifiers of the physical cores to be broadcast to, and the MN then broadcasts to the physical cores indicated by the broadcast range information. In a possible implementation, a 128-bit broadcast range register may be defined in which each bit represents one physical core. The broadcast range information in the register then indicates the MN's broadcast range at physical-core granularity, and the MN broadcasts to physical cores as the information indicates. Because the broadcast range information in this implementation has physical-core granularity, the broadcast range can be limited precisely.
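The per-core bitmap described above can be sketched in Python as follows. This is only an illustration: the source states that each of the 128 bits represents one physical core, but the mapping "bit position equals core identifier" and the helper names `encode_range` and `decode_range` are assumptions made for this sketch.

```python
NUM_CORES = 128  # width of the broadcast range register described in the text


def encode_range(core_ids):
    """Return an integer bitmap with one bit set per target physical core.

    Assumption: bit i stands for the physical core whose identifier is i.
    """
    mask = 0
    for core in core_ids:
        if not 0 <= core < NUM_CORES:
            raise ValueError(f"core id {core} is outside 0..{NUM_CORES - 1}")
        mask |= 1 << core
    return mask


def decode_range(mask):
    """Return the sorted list of core ids whose bits are set in the bitmap."""
    return [core for core in range(NUM_CORES) if (mask >> core) & 1]


# Software (for example a scheduler) would write the bitmap; the MN would decode it.
mask = encode_range([1, 5, 127])
targets = decode_range(mask)
```

With this encoding the MN needs only a bit test per managed core to decide whether that core is in scope.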
The broadcast range information may instead indicate the broadcast range at the granularity of layers. A layer may include multiple physical cores, and the embodiments of this application do not limit the rule by which layers are divided. For example, the broadcast range information may include the identifiers of the layers to be broadcast to, and the MN then broadcasts to every physical core in the indicated layers. Because the broadcast range information in this implementation has layer granularity, one layer identifier can correspond to multiple physical cores, so multiple physical cores can be indicated with fewer identifiers, which saves system resources.
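Layer-granularity scoping can be sketched as a lookup from layer identifiers to the cores they contain. The table contents and the names `LAYER_TO_CORES` and `expand_layers` are hypothetical; the source does not specify how layers are divided or stored.

```python
# Hypothetical layer table: layer identifier -> physical cores in that layer.
LAYER_TO_CORES = {
    0: [1, 2, 3, 4],
    3: [10, 11],
    4: [20, 21, 22],
}


def expand_layers(layer_ids, layer_table=LAYER_TO_CORES):
    """Expand layer identifiers into the sorted list of cores to broadcast to."""
    cores = set()
    for layer in layer_ids:
        cores.update(layer_table[layer])
    return sorted(cores)


# Two identifiers are enough to name five physical cores.
targets = expand_layers([3, 4])
```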
It can be understood that, when all the nodes indicated in the broadcast range information are managed by one MN, the number of MNs in the embodiments may be one. When the nodes indicated in the broadcast range information are managed by multiple different MNs, the RN may send the broadcast range information to one of those MNs. In addition to sending broadcast information to the nodes it manages, that MN may send the broadcast range information to the other MNs, so that each of the other MNs can also send broadcast information to the nodes it manages.
For example, FIG. 2 to FIG. 5 are schematic diagrams of four possible virtualization system architectures according to the embodiments of this application.
FIG. 2 is a schematic diagram of a first virtualization system architecture to which the method of the embodiments is applied. In the architecture of FIG. 2, the broadcast range information indicates the broadcast range at physical-core granularity, and all the nodes indicated in the broadcast range information are managed by one MN.
As shown in FIG. 2, the virtualization system includes RN-0, an MN, and RN-1 to RN-n.
RN-0 corresponds to physical core #1. A TLB and a broadcast range register are provided in physical core #1, and broadcast range information is preset in the broadcast range register. For example, the broadcast range information may be stored in the broadcast range register in advance by the virtual machine core CPU scheduler, and it may include the identifiers of the physical cores related to the VM running on RN-0. It can be understood that, when the broadcast range information in the broadcast range register needs to be updated, the virtual machine core CPU scheduler may schedule update code to update the register.
The following describes the process of maintaining memory consistency in the virtualization system, using an example in which RN-0 modifies entry information of the shared page table and initiates maintenance of the consistency of the entry information in the TLB of each physical core.
After RN-0 modifies the entry information of the shared page table, a TLBI is generated in RN-0. The TLBI is used to maintain the TLB in physical core #1, and RN-0 completes the invalidation of the entry information in its own TLB. RN-0 may further obtain the updated memory page table from the memory system and install it in its TLB, which completes the shared page table update on RN-0.
In addition, RN-0 may pack the broadcast range information and the TLBI into a DVM request, so that the DVM request includes the broadcast range information. RN-0 sends the DVM request to the MN over the CHI bus, and the MN parses the broadcast range information to obtain the physical cores it indicates. These physical cores may also be referred to as the other RNs.
For example, as shown in FIG. 2, the physical core indicated by the broadcast range information may be RN-1, and the MN broadcasts a snpDVM request to RN-1. It can be understood that the number of physical cores indicated by the broadcast range information may be determined according to the actual scenario and may be one or more; this is not specifically limited in the embodiments of this application.
Based on the snpDVM request, RN-1 may invalidate the entry information in its own TLB. RN-1 may also obtain the updated memory page table from the memory system, and it sends Resp information to the MN.
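The single-MN flow of FIG. 2 can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions, not the hardware described in the application: the classes `RequestNode` and `ManagementNode`, the dictionary used as a TLB, and the page number `0x10` are all illustrative.

```python
class RequestNode:
    """A physical core with its own TLB (virtual page -> physical frame)."""

    def __init__(self, name, tlb=None):
        self.name = name
        self.tlb = dict(tlb or {})

    def snoop(self, page):
        """Handle a snpDVM request: invalidate the entry, then respond."""
        self.tlb.pop(page, None)
        return ("Resp", self.name)


class ManagementNode:
    """An MN that snoops only the nodes named in the broadcast range."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}

    def handle_dvm(self, page, broadcast_range):
        """Send snpDVM to each in-scope node and collect the Resp messages."""
        responses = [self.nodes[name].snoop(page) for name in broadcast_range]
        done = len(responses) == len(broadcast_range)
        return done, responses


rn1 = RequestNode("RN-1", {0x10: 0xA})
rn2 = RequestNode("RN-2", {0x10: 0xA})
mn = ManagementNode([rn1, rn2])

# RN-0's DVM request names only RN-1, so RN-2 is never snooped.
done, responses = mn.handle_dvm(0x10, ["RN-1"])
```

In this model RN-2 keeps its entry because it lies outside the broadcast range, and the MN reports completion after a single response.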
In a possible implementation, RN-0 may modify multiple entries of the memory page table within a period of time, and may therefore generate multiple TLBIs asynchronously within that period. Correspondingly, RN-0 sends the DVM requests for those TLBIs to the MN asynchronously. To ensure that the entry invalidations indicated by the multiple DVM requests take effect synchronously on the physical cores corresponding to the broadcast range information, RN-0 may generate a data synchronization barrier (DSB) instruction. The DSB instruction instructs the MN to collect the Resp information of the nodes corresponding to the multiple DVM requests and then send the Resp information of all those nodes to RN-0 together, indicating that memory consistency maintenance is complete for the nodes corresponding to the multiple DVM requests.
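One way to picture the barrier behavior is an MN that tracks outstanding responses per DVM request and reports completion only when none remain. The bookkeeping below (request identifiers, a `pending` dictionary, the method names) is an assumption made for illustration; the source does not describe how the MN tracks requests.

```python
class ManagementNode:
    """Tracks which nodes still owe a Resp for each outstanding DVM request."""

    def __init__(self):
        self.pending = {}  # request id -> set of nodes that have not responded

    def dvm_request(self, req_id, targets):
        """Record a new DVM request and the nodes in its broadcast range."""
        self.pending[req_id] = set(targets)

    def resp(self, req_id, node):
        """Record a Resp from one node for one request."""
        self.pending[req_id].discard(node)

    def dsb_complete(self):
        """True once every outstanding DVM request has all of its responses."""
        return all(not waiting for waiting in self.pending.values())


mn = ManagementNode()
mn.dvm_request(1, ["RN-1"])
mn.dvm_request(2, ["RN-1", "RN-2"])

mn.resp(1, "RN-1")
before = mn.dsb_complete()  # request 2 is still waiting on both nodes

mn.resp(2, "RN-1")
mn.resp(2, "RN-2")
after = mn.dsb_complete()   # now every request is fully acknowledged
```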
For example, FIG. 3 is a schematic diagram of a second virtualization system architecture to which the method of the embodiments is applied. In the architecture of FIG. 3, the broadcast range information indicates the broadcast range at physical-core granularity, and the nodes indicated in the broadcast range information are managed by multiple different MNs.
As shown in FIG. 3, the virtualization system includes RN-0, MN1, MN2, RN1-1 to RN1-n, and RN2-1 to RN2-n. MN1 manages RN1-1 to RN1-n, and MN2 manages RN2-1 to RN2-n.
RN-0 corresponds to physical core #1. A TLB and a broadcast range register are provided in physical core #1, and broadcast range information is preset in the broadcast range register.
The following describes the process of maintaining memory consistency in the virtualization system, using an example in which RN-0 modifies entry information of the shared page table and initiates maintenance of the consistency of the entry information in the TLB of each physical core.
For the process from RN-0 modifying the entry information of the shared page table to RN-0 sending the DVM request to MN1, refer to the description of RN-0 sending the DVM request to the MN in the embodiment corresponding to FIG. 2. Details are not repeated here.
Unlike the embodiment corresponding to FIG. 2, in the embodiment corresponding to FIG. 3 the nodes indicated in the broadcast range information are managed by multiple different MNs. For example, the physical cores indicated in the broadcast range information include RN1-1, RN1-2, RN2-1, and RN2-2, where RN1-1 and RN1-2 are managed by MN1, and RN2-1 and RN2-2 are managed by MN2.
As shown in FIG. 3, RN-0 sends the DVM request to MN1. MN1 parses the broadcast range information and determines that it includes not only RN1-1 and RN1-2, which correspond to MN1, but also RN2-1 and RN2-2, which correspond to MN2.
MN1 broadcasts the snpDVM request to RN1-1 and RN1-2. At the same time, MN1 sends the DVM request to MN2. MN2 parses the broadcast range information, determines that the physical cores it indicates include RN2-1 and RN2-2, and sends the snpDVM request to RN2-1 and RN2-2.
RN1-1 and RN1-2 may invalidate the entry information in their TLBs and send Resp information to MN1 in the manner described in the embodiment corresponding to FIG. 2.
RN2-1 and RN2-2 may invalidate the entry information in their TLBs and send Resp information to MN2 in the manner described in the embodiment corresponding to FIG. 2.
After MN2 collects the Resp information of RN2-1 and RN2-2, it may send Resp information to MN1. MN1 collects the Resp information of RN1-1, RN1-2, and MN2, and sends Resp information to RN-0.
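The forwarding and aggregation between MN1 and MN2 can be sketched as follows. The model is deliberately simplified and rests on assumptions: each in-scope RN is treated as responding immediately, the `peers` list and the `forwarded` flag are hypothetical devices to stop a forwarded request from being forwarded again, and three RNs per MN are used in place of the unspecified n.

```python
class ManagementNode:
    """An MN that snoops its own in-scope RNs and forwards the request to peers."""

    def __init__(self, name, managed):
        self.name = name
        self.managed = set(managed)  # RNs this MN manages
        self.peers = []              # other MNs it can forward a DVM request to

    def handle_dvm(self, broadcast_range, forwarded=False):
        """Return the set of RNs whose Resp was collected for this request."""
        targets = set(broadcast_range)
        responded = targets & self.managed  # local RNs invalidate and respond
        remote = targets - self.managed
        if remote and not forwarded:
            for peer in self.peers:
                # The peer parses the same range and handles its own RNs,
                # then returns its aggregated responses to this MN.
                responded |= peer.handle_dvm(targets, forwarded=True)
        return responded


mn1 = ManagementNode("MN1", ["RN1-1", "RN1-2", "RN1-3"])
mn2 = ManagementNode("MN2", ["RN2-1", "RN2-2", "RN2-3"])
mn1.peers.append(mn2)

scope = ["RN1-1", "RN1-2", "RN2-1", "RN2-2"]
responded = mn1.handle_dvm(scope)  # what MN1 reports back to RN-0
```

RN1-3 and RN2-3 are managed but out of scope, so neither MN waits for them.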
It can be understood that FIG. 3 shows the case in which the physical cores indicated by the broadcast range information correspond to two MNs. In practice, the number of MNs may be greater than two. The MNs may pass DVM requests to one another, or use other means, to broadcast to the RNs managed by each MN indicated in the broadcast range information and to collect the replies. The embodiments of this application do not limit the number of MNs or the manner in which multiple MNs communicate.
For example, FIG. 4 is a schematic diagram of a third virtualization system architecture to which the method of the embodiments is applied. In the architecture of FIG. 4, the broadcast range information indicates the broadcast range at layer granularity, and all the nodes in the layers indicated in the broadcast range information are managed by one MN.
As shown in FIG. 4, the virtualization system includes RN-0, MN0, and RN0-1 to RN0-n. RN0-1 to RN0-n belong to one layer, layer 0, and MN0 manages RN0-1 to RN0-n.
RN-0 corresponds to physical core #1. A TLB and a broadcast range register are provided in physical core #1, and broadcast range information is preset in the broadcast range register.
The following describes the process of maintaining memory consistency in the virtualization system, using an example in which RN-0 modifies entry information of the shared page table and initiates maintenance of the consistency of the entry information in the TLB of each physical core.
For the process from RN-0 modifying the entry information of the shared page table to RN-0 sending the DVM request to MN0, refer to the description of RN-0 sending the DVM request to the MN in the embodiment corresponding to FIG. 2. Details are not repeated here.
Unlike the embodiment corresponding to FIG. 2, in the embodiment corresponding to FIG. 4 the broadcast range information indicates the broadcast range at layer granularity. For example, the broadcast range information indicates layer 0, and layer 0 includes RN0-1 to RN0-n.
As shown in FIG. 4, RN-0 sends the DVM request to MN0. MN0 parses the broadcast range information, determines that it indicates layer 0, and sends the snpDVM request to RN0-1 to RN0-n in layer 0.
RN0-1 to RN0-n may invalidate the entry information in their TLBs and send Resp information to MN0 in the manner described in the embodiment corresponding to FIG. 2. MN0 collects the Resp information of RN0-1 to RN0-n in layer 0 and sends Resp information to RN-0.
For example, FIG. 5 is a schematic diagram of a fourth virtualization system architecture to which the method of the embodiments is applied. In the architecture of FIG. 5, the broadcast range information indicates the broadcast range at layer granularity, and the multiple layers indicated in the broadcast range information are managed by multiple different MNs.
As shown in FIG. 5, the virtualization system includes RN-0, MN3, MN4, RN3-1 to RN3-n, and RN4-1 to RN4-n. RN3-1 to RN3-n belong to layer 3, and RN4-1 to RN4-n belong to layer 4. MN3 manages RN3-1 to RN3-n, and MN4 manages RN4-1 to RN4-n.
The following describes the process of maintaining memory consistency in the virtualization system, using an example in which RN-0 modifies entry information of the shared page table and initiates maintenance of the consistency of the entry information in the TLB of each physical core.
For the process from RN-0 modifying the entry information of the shared page table to RN-0 sending the DVM request to MN3, refer to the description of RN-0 sending the DVM request to the MN in the embodiment corresponding to FIG. 2. Details are not repeated here.
Unlike the embodiment corresponding to FIG. 4, in the embodiment corresponding to FIG. 5 the broadcast range information indicates multiple layers, and the layers are managed by different MNs. For example, the layers indicated in the broadcast range information include layer 3 and layer 4, where the nodes of layer 3 are managed by MN3 and the nodes of layer 4 are managed by MN4.
As shown in FIG. 5, RN-0 sends the DVM request to MN3. MN3 parses the broadcast range information and determines that it includes not only RN3-1 to RN3-n in layer 3, which corresponds to MN3, but also RN4-1 to RN4-n in layer 4, which corresponds to MN4.
MN3 broadcasts the snpDVM request to RN3-1 to RN3-n and sends the DVM request to MN4. MN4 parses the broadcast range information, determines that it indicates RN4-1 to RN4-n in layer 4, and broadcasts the snpDVM request to RN4-1 to RN4-n in layer 4.
RN3-1 to RN3-n may invalidate the entry information in their TLBs and send Resp information to MN3 in the manner described in the embodiment corresponding to FIG. 2.
RN4-1 to RN4-n may invalidate the entry information in their TLBs and send Resp information to MN4 in the manner described in the embodiment corresponding to FIG. 2.
After MN4 collects the Resp information of RN4-1 to RN4-n, it may send Resp information to MN3. MN3 collects the Resp information of RN3-1 to RN3-n and of MN4, and sends Resp information to RN-0.
It can be understood that FIG. 5 shows the case in which the layers indicated by the broadcast range information correspond to two MNs. In practice, the number of MNs may be greater than two. The MNs may pass DVM requests to one another, or use other means, to broadcast to the nodes in the layers corresponding to each MN indicated in the broadcast range information and to collect the replies. The embodiments of this application do not limit the number of MNs or the manner in which multiple MNs communicate.
It should be noted that, taking FIG. 2 as an example, the MN's own broadcast range may cover a larger number of nodes, such as RN-1 to RN-n. Because of the limit imposed by the broadcast range information, however, the MN may broadcast the snpDVM request to RN-1 only and not send it to RN-2 to RN-n. When page table consistency is maintained, the MN therefore does not need to wait for replies from RN-2 to RN-n, which saves waiting time, improves maintenance efficiency, reduces the amount of computation in the system, and saves computing resources.
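The saving can be made concrete by counting the Resp messages the MN must collect with and without a broadcast range. The function name `responses_to_await` and the choice of n = 8 are assumptions for this illustration; the source gives no concrete node count.

```python
def responses_to_await(managed_nodes, broadcast_range=None):
    """Count the Resp messages the MN must collect for one DVM request."""
    if broadcast_range is None:
        # Unrestricted broadcast: every managed node is snooped.
        return len(managed_nodes)
    # Scoped broadcast: only managed nodes that are also in the range.
    return len(set(managed_nodes) & set(broadcast_range))


nodes = [f"RN-{i}" for i in range(1, 9)]  # RN-1 .. RN-8, assuming n = 8

full = responses_to_await(nodes)              # unrestricted broadcast
scoped = responses_to_await(nodes, ["RN-1"])  # range names RN-1 only
```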
It can be understood that, in the embodiments corresponding to FIG. 3 to FIG. 5, as in the description of FIG. 2, the waiting time of the MN can also be reduced, maintenance efficiency improved, the amount of computation in the system reduced, and computing resources saved.
It should be noted that, in the embodiments corresponding to FIG. 2 to FIG. 5, the TLBI may be replaced by another instruction such as an instruction cache maintenance instruction (IC). The IC instruction is used to implement the functions of the TLBI described above, and details are not repeated here. In FIG. 2 to FIG. 5, memory consistency maintenance may also be triggered by RN-0 modifying the entry information in its own TLB. For example, if RN-0 clears the entry information in its own TLB, RN-0 may likewise request the relevant physical cores in the virtualization system to perform memory consistency maintenance. For the maintenance process, see the description above; details are not repeated here.
In the embodiments corresponding to FIG. 2 to FIG. 5, when the MN broadcasts the snpDVM instruction to the nodes indicated by the broadcast range information, it may send the instruction in a masked manner: the physical cores of the nodes not indicated by the broadcast range information are masked, so that they do not receive the snpDVM instruction.
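Masked delivery can be sketched as a bitwise AND between the set of cores an MN manages and the broadcast range. Representing both sets as integer bitmaps, and the names `apply_mask` and `delivered`, are assumptions for this sketch rather than details given in the source.

```python
def apply_mask(managed_mask, range_mask):
    """Return the delivery bitmap: managed cores that are also in the range."""
    return managed_mask & range_mask


def delivered(core, delivery_mask):
    """True if the snpDVM instruction reaches the given core."""
    return bool((delivery_mask >> core) & 1)


managed = 0b11110  # this MN manages cores 1 to 4
scope = 0b00010    # the broadcast range information names core 1 only

delivery = apply_mask(managed, scope)
```

Cores 2 to 4 are masked out in `delivery`, so in this model they never see the snpDVM instruction.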
The communication method provided by the embodiments of this application is described in detail below with reference to the accompanying drawings. FIG. 6 is a schematic flowchart of data synchronization in a multi-core system provided by this application. As shown in FIG. 6, the method of the embodiments includes:
S601. The request node generates an instruction.
In the embodiments of this application, the instruction may be used to instruct the first management node to perform an operation, and may be the smallest functional unit of execution. For example, the instruction may be a TLBI, or an instruction such as an IC.
For example, take the request node to be RN-0 running on physical core #1. When RN-0 modifies entry information in the shared page table, or modifies entry information in its own TLB, RN-0 may generate an instruction that can be used to indicate that the memory consistency of the virtualization system is to be maintained.
S602. The requesting node packages the instruction and the broadcast range information into a DVM request.
In this embodiment of the application, the broadcast range information indicates the broadcast range of the first management node. For example, the broadcast range information may indicate the nodes to which the first management node needs to broadcast.
The broadcast range information may be set in a register or in another storage device, which is not specifically limited in this embodiment of the application.
In this embodiment of the application, the DVM request is used to request the first management node to maintain the consistency of the TLBs in the physical cores. The DVM request carries the broadcast range information.
For example, RN-0 running on physical core #1 may, based on the TLBI, package the broadcast range information and the TLBI into a DVM request.
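The packaging step in S602 can be pictured with a minimal Python sketch. The `DvmRequest` type, its field names, and the `pack_dvm_request` helper are hypothetical names chosen for illustration; the application does not define a concrete request layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DvmRequest:
    """Illustrative DVM request; the field layout is an assumption."""
    opcode: str                  # the generated instruction, e.g. "TLBI" or "IC"
    source: str                  # the requesting node, e.g. "RN-0"
    broadcast_range: frozenset   # identifiers of the physical cores to notify


def pack_dvm_request(opcode, source, broadcast_range_register):
    """Combine the instruction with the range read from a (hypothetical) register."""
    return DvmRequest(opcode, source, frozenset(broadcast_range_register))


# RN-0 packages a TLBI together with a range naming physical cores 1, 2 and 3.
req = pack_dvm_request("TLBI", "RN-0", {1, 2, 3})
```

The request is modelled as immutable because, in this sketch, the management node only reads the range and never rewrites it.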
S603. The requesting node sends the DVM request to the first management node.
Correspondingly, the first management node receives the DVM request from the requesting node.
In this embodiment of the application, one possible implementation is that the requesting node sends the DVM request to the first management node through a bus. Correspondingly, the first management node may receive the DVM request from the requesting node through the bus.
It can be understood that, in this embodiment of the application, the bus protocol permits a DVM request to include broadcast range information, so the bus can transmit a DVM request that includes broadcast range information.
For example, take the requesting node to be RN-0 running on physical core #1 and the first management node to be the MN. During page table consistency maintenance, RN-0 sends a DVM request to the MN through the CHI bus, and the MN receives the DVM request from RN-0 through the CHI bus. The DVM request includes the broadcast range information.
S604. The first management node parses the broadcast range information to obtain M target nodes, where M is a positive integer.
In this embodiment of the application, a target node is a node in which a physical core indicated by the broadcast range information is located.
For example, take the first management node to be the MN. After receiving the DVM request, the MN may parse the broadcast range information to obtain the M nodes to which it needs to broadcast, as indicated by the broadcast range information, and use them as the M target nodes.
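As a rough illustration of S604, the sketch below maps the physical cores named in the broadcast range information to the nodes that host them. The `core_to_node` table and the `parse_targets` function are assumptions made for this example, not structures taken from the application.

```python
def parse_targets(broadcast_range, core_to_node):
    """Return the sorted names of the nodes hosting a core in broadcast_range.

    Cores that this management node does not know about are skipped.
    """
    return sorted({core_to_node[c] for c in broadcast_range if c in core_to_node})


# Hypothetical mapping from physical core identifier to the node running on it.
core_to_node = {0: "RN-0", 1: "RN-1", 2: "RN-2", 3: "RN-3"}

targets = parse_targets({1, 3}, core_to_node)
M = len(targets)  # number of target nodes
```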
S605. The first management node sends broadcast information to the M target nodes.
In this embodiment of the application, the broadcast information is used to instruct a target node to perform memory consistency maintenance. For example, the broadcast information may be the snpDVM request of the foregoing embodiments.
Memory consistency maintenance may mean keeping the data in the target node consistent with the data in the first management node. For example, after RN-0 modifies the shared page table, target nodes RN-1, RN-2, RN-3, and so on each need to update their TLBs to the modified shared page table. After receiving the broadcast information from the first management node, the M target nodes may each perform memory consistency maintenance. For example, any target node may invalidate the entry information currently stored in its TLB; on the next access, it obtains the updated shared page table from the memory system and stores the updated shared page table in the TLB.
S606. The first management node collects M pieces of response information from the M target nodes.
In this embodiment of the application, the response information indicates that a target node has completed data synchronization.
For example, after target node RN-1 completes data synchronization, RN-1 sends response information to the MN. Correspondingly, the MN collects the response information from RN-1. Similarly, the first management node may collect the response information from target node RN-2, and so on, until the MN has collected the response information from all M target nodes.
It can be understood that the target nodes may take different amounts of time to perform data synchronization, so they may send response information to the first management node at different times. During this process, the first management node needs to wait until it has collected the response information of all M nodes. It can then confirm that memory consistency maintenance is complete.
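Because responses may arrive in any order, the waiting behaviour in S606 can be sketched as a set of pending nodes that shrinks as responses arrive. The `ResponseCollector` class below is a hypothetical illustration, not the hardware mechanism described in the application.

```python
class ResponseCollector:
    """Tracks which target nodes still owe a response (illustrative only)."""

    def __init__(self, targets):
        self.pending = set(targets)

    def on_response(self, node):
        """Record one response; return True once every target has answered."""
        self.pending.discard(node)
        return not self.pending


collector = ResponseCollector(["RN-1", "RN-2", "RN-3"])
# Responses arrive out of order; maintenance is complete only after the last one.
progress = [collector.on_response(n) for n in ["RN-3", "RN-1", "RN-2"]]
```

With a narrower broadcast range, the pending set starts smaller, which is where the reduced waiting time described in this embodiment would come from.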
In a possible implementation, the first management node may also forward the DVM request to a second management node. For example, in the scenario corresponding to FIG. 3 or FIG. 5, the second management node may be the MN2 node or the MN4 node. The second management node may receive the DVM request from the first management node and, according to the DVM request, collect the target nodes corresponding to the second management node. For details, refer to the explanations of FIG. 3 and FIG. 5, which are not repeated here.
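One way to picture the forwarding decision is to split the target nodes by the management node that owns them. In the sketch below, the `node_to_mn` table, the label "MN1" for the first management node, and the `split_by_manager` function are all assumptions for illustration.

```python
def split_by_manager(targets, node_to_mn, local_mn):
    """Separate targets handled locally from those owned by other management nodes."""
    local, remote = [], {}
    for node in targets:
        mn = node_to_mn[node]
        if mn == local_mn:
            local.append(node)
        else:
            # The DVM request would be forwarded to this other management node.
            remote.setdefault(mn, []).append(node)
    return local, remote


# Hypothetical ownership table: RN-5 belongs to a second management node, MN2.
node_to_mn = {"RN-1": "MN1", "RN-2": "MN1", "RN-5": "MN2"}
local, remote = split_by_manager(["RN-1", "RN-5"], node_to_mn, "MN1")
```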
S607. The first management node sends, to the requesting node, information indicating that data synchronization is complete.
Correspondingly, the requesting node receives, from the first management node, the information indicating that memory consistency maintenance is complete.
In this embodiment of the application, the information indicating that memory consistency maintenance is complete may be response information that the first management node sends to the requesting node after collecting the response information of the M target nodes. This information may take the form of numbers, characters, or the like, which is not specifically limited in this embodiment of the application.
On receiving the information indicating that memory consistency maintenance is complete, the requesting node can confirm that memory consistency maintenance is complete.
In summary, in this embodiment of the application, the request that the requesting node sends to the first management node includes the broadcast range information, which narrows the broadcast range of the first management node. Broadcast information is subsequently sent to the nodes corresponding to the broadcast range information, which can reduce the time the requesting node waits for the invalidation operations of those nodes and improve the efficiency of maintaining memory consistency. Moreover, the nodes corresponding to the broadcast range information do not need to interrupt their tasks, so their performance is not affected.
On the basis of the embodiment corresponding to FIG. 6, in a possible implementation, the broadcast range information may include: identifiers of the physical cores related to a virtual machine running in the requesting node, and/or identifiers of the layers related to a virtual machine running in the requesting node, where a layer corresponds to multiple physical cores.
In this embodiment of the application, one or more virtual machines (VMs) may run in the requesting node.
A VM may be related to one or more physical cores. The identifier of a physical core unambiguously identifies that physical core. For example, the identifier may be the serial number, address, or name of the physical core, which is not specifically limited in this embodiment of the application.
A VM may be related to one or more layers. The identifier of a layer unambiguously identifies that layer. For example, the identifier may be the serial number or name of the layer, which is not specifically limited in this embodiment of the application.
The broadcast range information may include the identifiers of the physical cores related to the virtual machine running in the requesting node, or the identifiers of the layers related to that virtual machine, or both the identifiers of the physical cores and the identifiers of the layers. The first management node can then parse the broadcast range information and send broadcast information to the physical cores or layers that it indicates.
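Since a layer corresponds to multiple physical cores, a management node has to expand layer identifiers before it can address individual cores. The sketch below shows one possible expansion; the `layer_to_cores` table and the `expand_range` function are hypothetical and only illustrate the "and/or" combination described above.

```python
def expand_range(core_ids, layer_ids, layer_to_cores):
    """Return the set of physical cores named directly or through a layer."""
    cores = set(core_ids)
    for layer in layer_ids:
        cores.update(layer_to_cores[layer])
    return cores


# Hypothetical topology: each layer groups four physical cores.
layer_to_cores = {"L0": [0, 1, 2, 3], "L1": [4, 5, 6, 7]}

only_cores = expand_range([1, 5], [], layer_to_cores)
only_layer = expand_range([], ["L1"], layer_to_cores)
both = expand_range([1], ["L1"], layer_to_cores)
```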
Optionally, the broadcast range information may be set in a register of the requesting node. The register may be a dedicated broadcast range register for storing broadcast range information, or may be any register in the requesting node, which is not specifically limited in this embodiment of the application.
For example, FIG. 7 is a schematic diagram of a logical architecture in which an MN parses a DVM request. As shown in FIG. 7, a register 701 may be set in the MN 700.
After receiving the DVM request carrying the broadcast range information from the requesting node (sco physical core), the MN may compare the broadcast range information delivered over the bus with the nodes managed by this MN as indicated in register 701. If the broadcast range information matches nodes corresponding to this MN, the MN forwards the broadcast information to the matching nodes. The broadcast information may be, for example, a snpDVM request that includes the DVM encoding corresponding to the broadcast range information.
In another possible implementation, bitmaps of physical cores may be set in the MN, and each bitmap indicates a corresponding RN.
After receiving the DVM request carrying the broadcast range information from the requesting node, the MN forwards the broadcast information to the RNs specified by the bitmap information. The broadcast information may be, for example, a snpDVM request.
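The bitmap variant can be sketched as follows, under the assumption (not stated in the application) that bit i of an integer bitmap selects RN-i. The function name `rns_from_bitmap` is hypothetical.

```python
def rns_from_bitmap(bitmap, num_rns):
    """List the RNs whose bit is set; bit i is assumed to select RN-i."""
    return [f"RN-{i}" for i in range(num_rns) if (bitmap >> i) & 1]


# Bits 1 and 2 are set, so only RN-1 and RN-2 receive the snpDVM request.
selected = rns_from_bitmap(0b0110, 4)
```

A fixed-width bitmap keeps the comparison to a few bitwise operations, which is one plausible reason to prefer it over an explicit list of identifiers.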
On the basis of the embodiment corresponding to FIG. 6, in a possible implementation, S605 includes: the first management node sends the broadcast information to the M target nodes in a masking manner, where the masking manner masks the nodes other than the M target nodes.
In the masking manner, the first management node masks the physical cores that are not needed when it sends the broadcast request.
In this embodiment of the application, take the first management node to be the MN, the nodes corresponding to the MN to be node 1 to node m, and the target nodes to be node 1, node 5, and node 6. The MN may mask the physical cores of all nodes among node 1 to node m other than node 1, node 5, and node 6, and then send the broadcast information. Because the physical cores of the other nodes are masked, those nodes do not receive the broadcast information, and the MN does not need to wait for their replies regarding page table consistency maintenance. This saves waiting time for the MN and reduces the amount of computation in the system.
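The example with node 1, node 5, and node 6 can be reproduced in a small sketch. The value m = 8 and the function `masked_broadcast` are assumptions made for illustration; the application does not specify how masking is implemented.

```python
def masked_broadcast(all_nodes, targets):
    """Return the nodes that receive the broadcast after masking the rest."""
    target_set = set(targets)
    # mask[n] is True when node n is hidden from the broadcast.
    mask = {n: n not in target_set for n in all_nodes}
    return [n for n in all_nodes if not mask[n]]


m = 8                                  # hypothetical number of nodes under the MN
all_nodes = list(range(1, m + 1))      # node 1 to node m
delivered = masked_broadcast(all_nodes, [1, 5, 6])
responses_to_wait_for = len(delivered)
```

In this sketch the MN waits for three responses instead of eight, which is the saving the paragraph above describes.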
For example, FIG. 8 is a schematic flowchart of a specific data synchronization method in a multi-core system provided by this application. As shown in FIG. 8, the method of this embodiment includes:
S801. Software maintains the broadcast range register.
In this embodiment of the application, software refers to a collection of data and instructions organized in a specific order. For example, the software may be CPU scheduler software.
For example, a user may use the CPU scheduler software to schedule physical core #N on which the VM runs, and write the physical cores related to the running VM (that is, the broadcast range information) into the broadcast range register of physical core #N through custom instruction set architecture (ISA) instructions, so that the broadcast range register contains the broadcast range information.
S802. The requesting node generates a DVM request and sends the DVM request through the bus.
For example, when the requesting node modifies the shared page table (which may be understood as the VM software updating a translation), a TLBI may be generated in the requesting node. The requesting node packages the TLBI and the broadcast range information into a DVM request and sends the DVM request to the MN in the bus payload. The broadcast range information may also be understood as a custom broadcast range field.
S803. The MN forwards the broadcast in a directed manner.
For example, after receiving the DVM request, the MN parses the broadcast range information in the DVM request and, by masking, sends the broadcast information only to the RNs indicated by the broadcast range information.
S804. The MN collects Resp information in a directed manner.
For example, the MN collects the Resp information of the RNs indicated by the broadcast range information and, after collecting all of it, sends Resp information to the requesting node.
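Steps S801 to S804 can be strung together in a toy software model. Everything here is an assumption for illustration: the dictionary standing in for the broadcast range register, the dictionary-based TLBs, and the function names. A real system would implement these steps in hardware and over the bus.

```python
# Hypothetical model of the S801-S804 flow; all names are illustrative.
broadcast_range_register = {}   # per-core register written by scheduler software


def s801_update_register(core, related_rns):
    """S801: software records which RNs the VM on `core` is related to."""
    broadcast_range_register[core] = frozenset(related_rns)


def s802_make_request(core):
    """S802: the requesting node packs a TLBI with the stored range."""
    return {"op": "TLBI", "range": broadcast_range_register[core]}


def s803_s804_handle(request, managed_rns, tlbs):
    """S803/S804: forward only to RNs in range, then gather their Resp."""
    targets = sorted(set(managed_rns) & request["range"])
    resps = []
    for rn in targets:
        tlbs[rn].clear()        # the RN invalidates its TLB entries
        resps.append(rn)        # and answers with a Resp
    return resps == targets, targets


tlbs = {"RN-1": {"va": "pa"}, "RN-2": {"va": "pa"}, "RN-3": {"va": "pa"}}
s801_update_register(0, ["RN-1", "RN-3"])
done, notified = s803_s804_handle(s802_make_request(0), tlbs.keys(), tlbs)
```

RN-2 is outside the range, so in this model its TLB is left untouched and it is never asked for a response.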
In this embodiment of the application, the CPU scheduler software maintains the broadcast range register and can update the broadcast range in a customized manner. The MN subsequently sends broadcast information to the corresponding nodes by masking, which can reduce the time the MN spends waiting for the invalidation operations of those nodes, lower the software maintenance cost and the implementation difficulty of MN broadcasting, and improve the efficiency of consistency maintenance.
It can be understood that the data synchronization method of this embodiment can also be conveniently ported to other physical cores of the ARM architecture. It does not increase the design difficulty of the MN, and it moves the function of maintaining the range into software, which can reduce the hardware burden.
The embodiments of this application also provide a computer-readable storage medium. The methods described in the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media may include computer storage media and communication media, and may also include any medium that can transfer a computer program from one place to another. A storage medium may be any target medium that can be accessed by a computer.
As one possible design, the computer-readable medium may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that is intended to carry or store required program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The embodiments of this application also provide a computer program product. The methods described in the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. If implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the foregoing method embodiments are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a base station, a terminal, or another programmable device.
The foregoing specific implementations further describe the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing are merely specific implementations of the present invention and are not intended to limit its protection scope. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.

Claims (30)

  1. A virtualization system, characterized in that the virtualization system comprises a requesting node and a first management node;
    the requesting node is configured to send a distributed virtual memory (DVM) request to the first management node when memory consistency maintenance of the virtualization system is performed, wherein the DVM request comprises broadcast range information;
    the first management node is configured to parse the DVM request to obtain the broadcast range information, wherein the broadcast range information indicates information of M target nodes, and M is a positive integer; and
    the first management node is further configured to send broadcast information to each of the M target nodes, wherein the broadcast information is used to instruct each target node to perform memory consistency maintenance.
  2. The virtualization system according to claim 1, characterized in that the broadcast range information comprises: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to a virtual machine running in the requesting node, wherein the layer corresponds to multiple physical cores.
  3. The virtualization system according to claim 2, characterized in that,
    the first management node is specifically configured to send the broadcast information to each of the M target nodes when parsing determines that the broadcast range information indicates identifiers of M physical cores;
    or, the first management node is specifically configured to send the broadcast information to each of the M target nodes in a layer when parsing determines that the broadcast range information indicates an identifier of the layer.
  4. The virtualization system according to claim 2 or 3, characterized in that the nodes indicated by the broadcast range information comprise a node managed by a second management node;
    the first management node is further configured to send the DVM request to the second management node; and
    the second management node is configured to parse the DVM request to obtain the broadcast range information, and send the broadcast information to the node managed by the second management node.
  5. The virtualization system according to claim 2 or 3, characterized in that the layers indicated by the broadcast range information comprise a layer managed by a second management node;
    the first management node is further configured to send the DVM request to the second management node; and
    the second management node is configured to parse the DVM request to obtain the broadcast range information, and send the broadcast information to the nodes in the layer managed by the second management node.
  6. The virtualization system according to any one of claims 1 to 5, characterized in that a register is provided in the requesting node, and the register is used to store the broadcast range information;
    the requesting node is specifically configured to, when memory consistency maintenance of the virtualization system is performed, generate an instruction, package the instruction and the broadcast range information into the DVM request, and send the DVM request to the first management node.
  7. The virtualization system according to claim 6, characterized in that the instruction comprises a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC).
  8. The virtualization system according to any one of claims 1 to 7, characterized in that,
    the first management node is further configured to collect M pieces of response information from the M target nodes, wherein the response information of each target node indicates that the target node has completed memory consistency maintenance; and
    the first management node is further configured to send, to the requesting node, information indicating that memory consistency maintenance is complete.
  9. The virtualization system according to claim 8, characterized in that the requesting node is further configured to generate a data synchronization barrier (DSB) instruction when multiple DVM requests are issued within a preset time, wherein the DSB instruction is used to instruct the first management node to, after collecting the response information of the nodes corresponding to the multiple DVM requests, synchronously send to the first management node the information indicating that memory consistency maintenance is complete.
  10. The virtualization system according to any one of claims 1 to 9, characterized in that,
    the first management node is specifically configured to send the broadcast information to each of the M target nodes in a masking manner, wherein the masking manner masks the nodes other than the M target nodes.
  11. A method for maintaining memory consistency in a virtualization system, characterized in that the method comprises:
    sending, by a requesting node, a distributed virtual memory (DVM) request to a first management node when memory consistency maintenance of the virtualization system is performed, wherein the DVM request comprises broadcast range information;
    parsing, by the first management node, the DVM request to obtain the broadcast range information, wherein the broadcast range information indicates information of M target nodes, and M is a positive integer; and
    sending, by the first management node, broadcast information to each of the M target nodes, wherein the broadcast information is used to instruct each target node to perform memory consistency maintenance.
  12. The method according to claim 11, characterized in that the broadcast range information comprises: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to a virtual machine running in the requesting node, wherein the layer corresponds to multiple physical cores.
  13. The method according to claim 12, characterized in that the sending, by the first management node, broadcast information to each of the M target nodes comprises:
    sending, by the first management node, the broadcast information to each of the M target nodes when parsing determines that the broadcast range information indicates identifiers of M physical cores;
    or, sending, by the first management node, the broadcast information to each of the M target nodes in a layer when parsing determines that the broadcast range information indicates an identifier of the layer.
  14. The method according to claim 12 or 13, characterized in that the nodes indicated by the broadcast range information comprise a node managed by a second management node and/or a layer managed by the second management node, and the method further comprises:
    sending, by the first management node, the DVM request to the second management node.
  15. The method according to any one of claims 11 to 14, characterized in that the sending, by the requesting node, a distributed virtual memory (DVM) request to the first management node when memory consistency maintenance of the virtualization system is performed comprises:
    generating, by the requesting node, an instruction when memory consistency maintenance of the virtualization system is performed;
    packaging, by the requesting node, the instruction and the broadcast range information into the DVM request; and
    sending, by the requesting node, the DVM request to the first management node.
  16. The method according to claim 15, characterized in that the instruction comprises a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC).
  17. The method according to any one of claims 11-16, further comprising:
    the first management node collects M pieces of response information from the M target nodes, wherein the response information of each target node indicates that the target node has completed memory consistency maintenance;
    the first management node sends, to the requesting node, information indicating that memory consistency maintenance is complete.
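The collection step of claim 17 amounts to waiting until every target has answered before reporting completion. A minimal sketch, assuming a hypothetical `ResponseCollector` class keyed by physical core ID:

```python
from typing import Iterable, Set


class ResponseCollector:
    """Tracks which of the M target nodes have not yet responded."""

    def __init__(self, targets: Iterable[int]) -> None:
        self._pending: Set[int] = set(targets)

    def on_response(self, core_id: int) -> bool:
        """Record one response; return True once all targets have responded."""
        self._pending.discard(core_id)
        return not self._pending


collector = ResponseCollector([1, 5])
collector.on_response(1)         # still waiting for core 5
done = collector.on_response(5)  # all responses collected
```

When `on_response` returns True, the management node in this sketch would send the completion information to the requesting node.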
  18. The method according to claim 17, further comprising:
    when the requesting node issues multiple DVM requests within a preset time, the requesting node generates a data synchronization barrier (DSB) instruction, wherein the DSB instruction instructs the first management node, after it has collected the response information of the nodes corresponding to the multiple DVM requests, to synchronously send, to the first management node, the information indicating that memory consistency maintenance is complete.
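The barrier behaviour of claim 18 can be pictured as a tracker that holds back the completion indication until every outstanding DVM request has been fully answered. The `DsbTracker` class, the request IDs, and the method names below are hypothetical.

```python
from typing import Dict, Iterable, Set


class DsbTracker:
    """Tracks outstanding DVM requests issued before a DSB-style barrier."""

    def __init__(self) -> None:
        # request ID -> target cores that have not responded yet
        self._pending: Dict[int, Set[int]] = {}

    def issue(self, req_id: int, targets: Iterable[int]) -> None:
        self._pending[req_id] = set(targets)

    def on_response(self, req_id: int, core_id: int) -> None:
        waiting = self._pending.get(req_id)
        if waiting is None:
            return  # unknown or already completed request
        waiting.discard(core_id)
        if not waiting:
            del self._pending[req_id]

    def barrier_complete(self) -> bool:
        """True only when every issued request has all of its responses."""
        return not self._pending
```

Under this sketch a single completion indication is released once `barrier_complete()` becomes true, rather than one indication per request.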
  19. The method according to any one of claims 11-18, wherein the first management node sending broadcast information to each of the M target nodes comprises:
    the first management node sends the broadcast information to each of the M target nodes in a masking manner, wherein the masking manner masks the nodes other than the M target nodes.
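The masking manner of claim 19 can be sketched with a bitmask. The polarity used here (a set bit means the node is masked and does not receive the broadcast) and the helper names are assumptions; the claims only state that non-target nodes are masked.

```python
from typing import Iterable, List


def build_mask(total_nodes: int, targets: Iterable[int]) -> int:
    """Return a bitmask in which bit i is 1 if node i is masked (excluded)."""
    mask = (1 << total_nodes) - 1  # start with every node masked
    for node in targets:
        if not 0 <= node < total_nodes:
            raise ValueError(f"node {node} out of range")
        mask &= ~(1 << node)       # unmask each target node
    return mask


def receivers(total_nodes: int, mask: int) -> List[int]:
    """List the nodes that are not masked and therefore receive the broadcast."""
    return [i for i in range(total_nodes) if not (mask >> i) & 1]
```

With 8 nodes and targets 1 and 5, `build_mask` leaves only bits 1 and 5 cleared, so `receivers` yields exactly those two nodes.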
  20. A method for maintaining memory consistency in a virtualization system, wherein the method comprises:
    a first management node receives a distributed virtual memory (DVM) request from a requesting node, wherein the DVM request includes broadcast range information;
    the first management node parses the DVM request to obtain the broadcast range information, wherein the broadcast range information indicates information of M target nodes, and M is a positive integer;
    the first management node sends broadcast information to each of the M target nodes, wherein the broadcast information instructs each target node to perform memory consistency maintenance.
  21. The method according to claim 20, wherein the broadcast range information comprises: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, wherein the layer corresponds to multiple physical cores.
  22. The method according to claim 21, wherein the first management node sending broadcast information to each of the M target nodes comprises:
    when the first management node determines, by parsing, that the broadcast range information indicates identifiers of M physical cores, the first management node sends the broadcast information to each of the M target nodes;
    or, when the first management node determines, by parsing, that the broadcast range information indicates an identifier of a layer, the first management node sends the broadcast information to each of the M target nodes in the layer.
  23. The method according to claim 21 or 22, wherein the nodes indicated by the broadcast range information include a node managed by a second management node and/or a layer managed by the second management node, and the method further comprises:
    the first management node sends the DVM request to the second management node.
  24. The method according to any one of claims 20-23, further comprising:
    the first management node collects M pieces of response information from the M target nodes, wherein the response information of each target node indicates that the target node has completed memory consistency maintenance;
    the first management node sends, to the requesting node, information indicating that memory consistency maintenance is complete.
  25. The method according to any one of claims 20-24, wherein the first management node sending broadcast information to each of the M target nodes comprises:
    the first management node sends the broadcast information to each of the M target nodes in a masking manner, wherein the masking manner masks the nodes other than the M target nodes.
  26. A method for maintaining memory consistency in a virtualization system, wherein the method comprises:
    a requesting node generates an instruction when performing memory consistency maintenance of the virtualization system;
    the requesting node packages the instruction and broadcast range information set in the requesting node into the DVM request;
    the requesting node sends the DVM request to the first management node.
  27. The method according to claim 26, wherein the broadcast range information comprises: an identifier of a physical core related to a virtual machine running in the requesting node, and/or an identifier of a layer related to the virtual machine running in the requesting node, wherein the layer corresponds to multiple physical cores.
  28. The method according to claim 26 or 27, wherein the instruction comprises a translation lookaside buffer instruction (TLBI) or a cache maintenance instruction (IC instruction).
  29. The method according to any one of claims 26-28, wherein:
    the requesting node receives, from the first management node, information indicating that memory consistency maintenance is complete.
  30. The method according to claim 29, further comprising:
    when the requesting node issues multiple DVM requests within a preset time, the requesting node generates a data synchronization barrier (DSB) instruction, wherein the DSB instruction instructs the first management node, after it has collected the response information of the nodes corresponding to the multiple DVM requests, to synchronously send, to the first management node, the information indicating that memory consistency maintenance is complete.
PCT/CN2021/091774 2021-04-30 2021-04-30 Virtualization system and method for maintaining memory consistency in virtualization system WO2022227093A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180090441.3A CN116830093A (en) 2021-04-30 2021-04-30 Virtualized system and memory consistency maintenance method in virtualized system
PCT/CN2021/091774 WO2022227093A1 (en) 2021-04-30 2021-04-30 Virtualization system and method for maintaining memory consistency in virtualization system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/091774 WO2022227093A1 (en) 2021-04-30 2021-04-30 Virtualization system and method for maintaining memory consistency in virtualization system

Publications (1)

Publication Number Publication Date
WO2022227093A1 true WO2022227093A1 (en) 2022-11-03

Family

ID=83847592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/091774 WO2022227093A1 (en) 2021-04-30 2021-04-30 Virtualization system and method for maintaining memory consistency in virtualization system

Country Status (2)

Country Link
CN (1) CN116830093A (en)
WO (1) WO2022227093A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108780350A (en) * 2016-03-31 2018-11-09 高通股份有限公司 Hardware managed power collapse and clock wake-up for memory management units and distributed virtual memory networks
US20170286335A1 (en) * 2016-04-04 2017-10-05 Qualcomm Incorporated Interconnect Distributed Virtual Memory Message Preemptive Responding
US20180232320A1 (en) * 2016-07-29 2018-08-16 Advanced Micro Devices, Inc. Controlling Access by IO Devices to Pages in a Memory in a Computing Device
CN112540938A (en) * 2019-09-20 2021-03-23 阿里巴巴集团控股有限公司 Processor core, processor, apparatus and method

Also Published As

Publication number Publication date
CN116830093A (en) 2023-09-29

Similar Documents

Publication Publication Date Title
US9559940B2 (en) Take-over of network frame handling in a computing environment
US11249927B2 (en) Directed interrupt virtualization
EP2290539B1 (en) Computer configuration virtual topology discovery
US9383997B2 (en) Apparatus, system, and method for persistent user-level thread
US11182197B2 (en) Guest-initiated announcement of virtual machine migration
CN101625664B (en) Satisfying memory ordering requirements between partial writes and non-snoop accesses
US20210055945A1 (en) Interrupt signaling for directed interrupt virtualization
CN103368848A (en) Information processing apparatus, arithmetic device, and information transferring method
AU2020222167B2 (en) Directed interrupt for multilevel virtualization
JP2015191657A (en) Inter-architecture compatibility module for allowing code module of one architecture to use library module of another architecture
JP2013041409A (en) Information processing apparatus, interruption control method and interruption control program
CN113412472A (en) Directed interrupt virtualization with interrupt table
CN101635679B (en) Dynamic update of route table
EP3924819A1 (en) Directed interrupt for multilevel virtualization with interrupt table
CN110119304B (en) Interrupt processing method and device and server
TW201732579A (en) System, method, and apparatuses for remote monitoring
CN103455371A (en) Mechanism for optimized intra-die inter-nodelet messaging communication
US20090199191A1 (en) Notification to Task of Completion of GSM Operations by Initiator Node
US8146094B2 (en) Guaranteeing delivery of multi-packet GSM messages
WO2022227093A1 (en) Virtualization system and method for maintaining memory consistency in virtualization system
US10664407B2 (en) Dual first and second pointer for memory mapped interface communication with lower indicating process
US8972635B2 (en) Processor and information processing apparatus
CN114860551A (en) Method, device and equipment for determining instruction execution state and multi-core processor
CN117435529A (en) Direct Memory Access (DMA) transmission method and electronic equipment
CN116166468A (en) Method for processing ECC errors in heterogeneous system, heterogeneous system and related products thereof

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21938566; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 202180090441.3; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)