WO2014206105A1 - Virtual switching method, related device and computer system - Google Patents

Virtual switching method, related device and computer system

Info

Publication number
WO2014206105A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
virtual
message
data
exchanged
Application number
PCT/CN2014/072502
Other languages
English (en)
French (fr)
Inventor
林洋
郑坤
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd.
Priority to EP14818411.2A (published as EP2996294A4)
Priority to US14/486,246 (published as US9996371B2)
Publication of WO2014206105A1
Priority to US15/979,486 (published as US10649798B2)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533: Hypervisors; Virtual machine monitors
    • G06F9/45558: Hypervisor-specific management and integration aspects
    • G06F2009/4557: Distribution of virtual machine instances; Migration and load balancing
    • G06F2009/45579: I/O management, e.g. providing access to device drivers or storage
    • G06F2009/45595: Network integration; Enabling network access in virtual machine instances
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0803: Configuration setting
    • H04L41/0806: Configuration setting for initial configuration or provisioning, e.g. plug-and-play
    • H04L41/0895: Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
    • H04L49/00: Packet switching elements
    • H04L49/70: Virtual switches

Definitions

  • the present invention relates to the field of computer technology, and more particularly to virtual switching methods, related devices, and computer systems.
  • Network virtualization is a way to separate network traffic from physical network elements using software-based abstractions.
  • Network virtualization has much in common with other forms of virtualization.
  • abstraction isolates network traffic from switches, network ports, routers, and other physical elements in the network. Each physical element is replaced by a virtual representation of the network element. Administrators can configure virtual network elements to meet their unique needs.
  • A main advantage of network virtualization is the consolidation of multiple physical networks into one larger logical network.
  • The main existing network virtualization solutions are Open vSwitch (OVS) and VMware's Distributed Virtual Switch (DVS).
  • In these solutions the virtual switch is implemented in the host kernel, that is, in the Virtual Machine Monitor (VMM) kernel, which places it at the core of the virtual network.
  • the architecture is shown in Figure 1.
  • The vSwitch connects to virtual machines (VMs) through virtual ports, and to the network interface card (NIC) through front-end/back-end (FE/BE) drivers.
  • the host allocates physical resources such as CPU and memory to the virtual machines and various virtual hardwares.
  • the physical resources are divided into kernel space physical resources and user space physical resources.
  • The vSwitch needs to request additional kernel-space physical resources from the Host during the exchange process, which is very unfavorable for the Host's management and allocation of resources for the virtual network.
  • The vSwitch is responsible for many tasks and functions, such as virtual local area network (VLAN) support, load balancing, tunneling, security, Link Aggregation Control Protocol (LACP), and Quality of Service (QoS), as shown in Figure 1, so it is designed to be very large and complex.
  • The tight coupling between the vSwitch and the Host kernel makes the vSwitch and the entire virtual network poorly scalable and inflexible.
  • Embodiments of the present invention provide a virtual switching method, a related device, and a computer system that separate the virtual switching function from the kernel, improving the scalability and flexibility of the virtual switching device. The virtual switching function is deployed on a virtual machine that forms a peer node with ordinary virtual machines, which makes it easier for the Host to manage the virtual network and to allocate resources efficiently and reasonably.
  • A virtual switching method is provided, applied to a computing node, where the computing node includes: a hardware layer, a host (Host) running on the hardware layer, and at least one virtual machine (VM) running on the Host. The hardware layer includes an input/output I/O device and a storage device, the at least one VM includes a first virtual machine having a virtual switching function, and the at least one VM further includes a second virtual machine. The method includes: the first virtual machine receives a first message sent by a source node, where the first message is used to request the first virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and a configured port mapping table, and sends the second message, where the second message is used to instruct the target node to obtain the data to be exchanged from the storage device of the hardware layer.
  • Optionally, the method further includes: the first virtual machine receives a configuration command sent by the Host; according to the configuration command, the first virtual machine configures a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device; and the first virtual machine establishes a mapping relationship between the first virtual port and the second virtual port to generate the port mapping table.
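  • The configuration flow above can be sketched as a small port mapping table. All names below (`PortMappingTable`, `vport1`, `vm2`, `nic0`, the shared-memory addresses) are hypothetical illustrations, not identifiers from the patent:

```python
# Hypothetical sketch: the vSwitch agent builds a port mapping table
# from configuration commands sent by the Host.

class PortMappingTable:
    def __init__(self):
        self._by_port = {}   # virtual port -> (peer node address, shared memory address)
        self._by_addr = {}   # node address -> virtual port

    def configure(self, port, node_addr, shm_addr):
        """Register a virtual port for a node (a VM or the I/O device)."""
        self._by_port[port] = (node_addr, shm_addr)
        self._by_addr[node_addr] = port

    def port_for(self, node_addr):
        """Look up the virtual port that reaches a given node address."""
        return self._by_addr[node_addr]

    def entry(self, port):
        """Return (node address, shared memory address) for a virtual port."""
        return self._by_port[port]

# The Host sends configuration commands; the agent applies them:
table = PortMappingTable()
table.configure("vport1", node_addr="vm2", shm_addr=0x1000)  # first virtual port, to the second VM
table.configure("vport2", node_addr="nic0", shm_addr=None)   # second virtual port, to the I/O device

assert table.port_for("vm2") == "vport1"
assert table.entry("vport2") == ("nic0", None)
```

Keeping both directions of the mapping makes the two lookups the method needs (port by address, and shared memory by receiving port) constant-time.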
  • Optionally, the method further includes: the first virtual machine configures, according to the configuration command, a first shared memory corresponding to the second virtual machine, where the first shared memory is a designated storage area on the storage device of the hardware layer.
  • When the target node is the I/O device, determining and sending the second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table includes: the first virtual machine determines the address of the corresponding first shared memory according to the first virtual port used to receive the first message, and obtains the data to be exchanged from the first shared memory; according to the address of the I/O device carried in the data to be exchanged, the first virtual machine determines from the port mapping table the second virtual port corresponding to the I/O device; the first virtual machine determines the second message carrying the address of the first shared memory and a read command, and sends the second message to the I/O device through the second virtual port, so that the I/O device reads the data to be exchanged from the first shared memory.
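  • As a rough illustration of the exchange path described above (the source VM writes into shared memory, the vSwitch translates the first message into a second message via the port mapping table, and the I/O device reads the data), here is a hypothetical sketch; a real implementation operates on hypervisor-managed shared pages and interrupts, not Python dictionaries, and all names are illustrative:

```python
# Hypothetical sketch of the VM -> I/O device exchange path.
shared_memory = {0x1000: b""}       # first shared memory region (address -> contents)
port_map = {"vport1": 0x1000}       # receiving virtual port -> shared memory address
addr_to_port = {"nic0": "vport2"}   # target node address -> second virtual port

def vm_send(data: bytes, target: str):
    """Source VM writes the data into shared memory, then signals the vSwitch."""
    shared_memory[0x1000] = data
    # First message: a write-completion notification carrying the target address.
    return {"type": "write_complete", "port": "vport1", "target": target}

def vswitch_handle(first_message):
    """vSwitch derives the second message from the port mapping table."""
    shm_addr = port_map[first_message["port"]]        # shared memory by receiving port
    out_port = addr_to_port[first_message["target"]]  # second virtual port by target address
    # Second message: shared memory address plus a read command.
    return {"type": "read_cmd", "port": out_port, "shm_addr": shm_addr}

def nic_receive(second_message):
    """Target I/O device reads the data directly from shared memory (no extra copy)."""
    return shared_memory[second_message["shm_addr"]]

msg1 = vm_send(b"packet-payload", target="nic0")
msg2 = vswitch_handle(msg1)
assert nic_receive(msg2) == b"packet-payload"
```

Note that only the two short control messages cross the vSwitch; the payload itself stays in the shared region the whole time.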
  • When the source node is the I/O device and the target node is the second virtual machine, after receiving the first message sent by the source node, the first virtual machine obtains from the I/O device the address of the target node carried in the data to be exchanged, where the address of the target node is the address of the second virtual machine. Determining and sending the second message then includes: the first virtual machine queries the port mapping table according to the address of the second virtual machine to determine the first virtual port corresponding to the second virtual machine and the address of the first shared memory corresponding to the second virtual machine; through the second virtual port corresponding to the I/O device, the first virtual machine sends a reply message carrying the address of the first shared memory to the I/O device, so that the I/O device writes the data to be exchanged into the first shared memory according to the reply message.
  • The at least one VM further includes a third virtual machine. When the source node is the second virtual machine and the target node is the third virtual machine, the first virtual machine receiving the first message includes: the first virtual machine receives, through the first virtual port, the first message sent by the second virtual machine, where the first message includes a write-completion interrupt indicating to the first virtual machine that the second virtual machine has finished writing the data to be exchanged into a second shared memory pre-negotiated by the second virtual machine and the third virtual machine through the first virtual machine, where the second shared memory is a designated storage area on the storage device of the hardware layer. Determining and sending the second message then includes: the first virtual machine determines the address of the second virtual machine corresponding to the first virtual port according to the first virtual port used to receive the first message; according to the address of the second virtual machine and the address of the target node carried in the data to be exchanged, it determines the address of the second shared memory; it determines the second message carrying the address of the second shared memory and a read command, and sends the second message to the third virtual machine.
  • Optionally, the method further includes: receiving read-completion indication information sent by the target node, so that the first shared memory or the second shared memory can be released.
  • Optionally, the method further includes: according to the address of the target node carried in the data to be exchanged, the first virtual machine determines, in a configured OpenFlow flow table, the entry that matches the address of the target node, where the OpenFlow flow table includes at least one entry and each entry includes an address, a virtual port, and an execution action parameter.
  • A host machine is provided, including: a creation module, configured to generate at least one virtual machine VM on the host (Host) after the I/O virtual function of an input/output I/O device is started, where the at least one VM includes a first virtual machine having a virtual switching function and the at least one VM further includes a second virtual machine; and a module configured to send a configuration command to the first virtual machine, so that the first virtual machine configures, according to the configuration command, a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device.
  • A virtual machine is further provided, which runs on a host (Host), where the Host runs on a hardware layer, and the hardware layer includes an input/output I/O device and a storage device.
  • The virtual machine includes: a receiving module, configured to receive a first message sent by a source node, where the first message is used to request the virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is a second virtual machine running on the Host; an exchange processing module, configured to determine a second message according to the address of the target node carried in the data to be exchanged and a port mapping table configured by the virtual machine, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer; and a sending module, configured to send the second message to the target node.
  • Optionally, the virtual machine further includes: an agent module, configured to configure, according to a configuration command sent by the Host, a first virtual port of the virtual machine for communicating with the second virtual machine and a second virtual port of the virtual machine for communicating with the I/O device; and a generating module, configured to establish a mapping relationship between the first virtual port and the second virtual port to generate the port mapping table.
  • The agent module is further configured to configure, according to the configuration command, a first shared memory corresponding to the second virtual machine, where the first shared memory is a designated storage area on the storage device of the hardware layer.
  • The receiving module is specifically configured to receive the first message through the first virtual port, where the first message includes a write-completion interrupt indicating to the virtual machine that the source node has finished writing the data to be exchanged into the first shared memory; the exchange processing module is specifically configured to determine the address of the corresponding first shared memory according to the first virtual port used to receive the first message.
  • The receiving module is specifically configured to receive the first message sent by the source node; the exchange processing module is specifically configured to obtain the address of the target node carried in the data to be exchanged, query the port mapping table according to the address of the target node to determine the first virtual port corresponding to the target node, and determine the address of the first shared memory corresponding to the target node; the sending module is configured to send, through the second virtual port corresponding to the source node, a reply message carrying the address of the first shared memory to the source node; the exchange processing module is further configured to determine the second message carrying a read command when receiving the write interrupt sent by the source node indicating to the virtual machine that the source node has finished writing the data to be exchanged into the first shared memory; and the sending module is further configured to send the second message to the target node through the first virtual port.
  • The receiving module is configured to receive, through the first virtual port, the first message sent by the source node, where the first message includes a write-completion interrupt; the exchange processing module is configured to determine the address of the source node corresponding to the first virtual port according to the first virtual port used to receive the first message, determine the address of the second shared memory according to the address of the source node and the address of the target node carried in the data to be exchanged, and determine the second message carrying the address of the second shared memory and a read command; the sending module is configured to send the second message to the target node. Here the at least one VM further includes a third virtual machine, the source node is the second virtual machine, and the target node is the third virtual machine.
  • A computing node is provided, including: a hardware layer, a host (Host) running on the hardware layer, and at least one virtual machine (VM) running on the Host, where the hardware layer includes an input/output I/O device and a storage device, the at least one VM includes a first virtual machine having a virtual switching function, and the at least one VM further includes a second virtual machine. The first virtual machine is configured to receive a first message sent by a source node, where the first message is used to request the first virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine. The first virtual machine is further configured to determine a second message according to the address of the target node carried in the data to be exchanged and a configured port mapping table, and to send the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • The Host is configured to send a configuration command to the first virtual machine; the first virtual machine is further configured to configure, according to the configuration command, a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device; and the first virtual machine is further configured to establish a mapping relationship between the first virtual port and the second virtual port to generate the port mapping table.
  • The first virtual machine is further configured to configure, according to the configuration command, a first shared memory corresponding to the second virtual machine, where the first shared memory is a designated storage area on the storage device of the hardware layer.
  • The source node is configured to write the data to be exchanged into the first shared memory and to send the first message to the first virtual machine. The first virtual machine is configured to: receive the first message through the first virtual port, where the first message includes a write-completion interrupt indicating to the first virtual machine that the source node has finished writing the data to be exchanged into the first shared memory; determine the address of the corresponding first shared memory according to the first virtual port used to receive the first message and obtain the data to be exchanged from the first shared memory; determine, according to the address of the I/O device carried in the data to be exchanged, the second virtual port corresponding to the I/O device from the port mapping table; and determine the second message carrying the address of the first shared memory and a read command, sending the second message to the target node through the second virtual port. The target node is configured to read the data to be exchanged from the first shared memory.
  • The first virtual machine is specifically configured to receive the first message sent by the source node, obtain the address of the target node carried in the data to be exchanged, and query the port mapping table according to the address of the target node to determine the first virtual port and the first shared memory corresponding to the target node; the target node is configured to read the data to be exchanged from the first shared memory according to the second message. In this case the source node is the I/O device and the target node is the second virtual machine.
  • The source node is further configured to write the data to be exchanged into a second shared memory pre-negotiated by the source node and the target node through the first virtual machine, where the second shared memory is a designated storage area on the storage device of the hardware layer; the source node is further configured to send the first message to the first virtual machine through the first virtual port, where the first message includes a write-completion interrupt; the first virtual machine is specifically configured to determine, according to the first virtual port used to receive the first message, the address of the source node corresponding to the first virtual port.
  • The first virtual machine is further configured to release the first shared memory or the second shared memory.
  • After receiving the first message sent by the source node, the first virtual machine is further configured to determine, according to the address of the target node carried in the data to be exchanged, the entry in the configured OpenFlow flow table that matches the address of the target node, where the OpenFlow flow table includes at least one entry and each entry includes an address, a virtual port, and an execution action parameter; if a matching entry exists, to process the data to be exchanged according to the execution action parameter corresponding to the address of the target node in the matching entry; and if no matching entry exists, to establish a new entry that can match the data to be exchanged and insert the new entry into the OpenFlow flow table.
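  • The match-or-insert behaviour of the OpenFlow-style flow table described above can be sketched as follows; the entry layout and the default port/action are illustrative assumptions, not the patent's exact format:

```python
# Hypothetical sketch of flow-table lookup: match on the target address;
# on a hit, return the stored port and action; on a miss, install a new
# entry so that later packets of the same flow match directly.

flow_table = []  # entries: (address, virtual_port, execution_action)

def lookup(target_addr, default_port="vport_default", default_action="forward"):
    for addr, port, action in flow_table:
        if addr == target_addr:
            return port, action                      # matched entry: use its action
    entry = (target_addr, default_port, default_action)
    flow_table.append(entry)                         # miss: insert a new matching entry
    return default_port, default_action

flow_table.append(("vm3", "vport3", "forward"))      # pre-configured entry
assert lookup("vm3") == ("vport3", "forward")        # hit
assert lookup("vm9") == ("vport_default", "forward") # miss installs an entry
assert ("vm9", "vport_default", "forward") in flow_table
```

Real OpenFlow tables match on many header fields and keep counters per entry; a single-field match is enough to show the hit/miss/install cycle.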
  • a computer system comprising: at least one computing node as described in the fourth aspect.
  • The computing node in the embodiments of the present invention includes: a hardware layer, a host (Host) running on the hardware layer, and at least one virtual machine (VM) running on the Host, where the hardware layer includes an input/output I/O device and a storage device, the at least one VM includes a first virtual machine having a virtual switching function, and the at least one VM further includes a second virtual machine.
  • With the virtual switching function implemented this way, the virtual switch is a peer of ordinary VMs, forming a peer-to-peer network virtualization architecture.
  • The virtual switch uses the physical resources of user space in the same way as an ordinary VM, which facilitates the Host's management and efficient allocation of resources.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to the target node, and at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and a configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • This method decouples the virtual switching function from the Host kernel, reducing the coupling with the Host, and allows multiple vSwitches to be deployed in the same Host without being constrained by the Host, so it scales better. Because the decoupled vSwitch no longer relies on the operating system in the Host kernel, it becomes easier to deploy and is therefore more portable; and because the configuration module (agent) is separated from the data-exchange forwarding module (the port mapping table), the design better meets the requirements of software-defined networking.
  • FIG. 2 is a schematic diagram of a virtualized software and hardware architecture according to an embodiment of the present invention.
  • FIG. 3 is a flow chart of a virtual switching method according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a virtual switched data stream according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a virtual switched data stream according to another embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a virtual switched data stream according to another embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a virtual switching device for a software-defined network (SDN) according to another embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a distributed implementation according to another embodiment of the present invention.
  • FIG. 9 is a flow chart of a distributed implementation according to another embodiment of the present invention.
  • FIG. 10 is a schematic diagram of the module architecture of a host machine according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of the module architecture of a virtual machine according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of a computing node according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of a computer system according to an embodiment of the present invention.
  • Virtual machine (VM): virtual machine software can simulate one or more virtual computers on a single physical computer. These virtual machines work like real computers; operating systems and applications can be installed on them, and they can access network resources. To an application running inside a virtual machine, the virtual machine appears to be a real computer.
  • the hardware layer may include various hardware.
  • For example, the hardware layer of a computing node may include a CPU and memory, and may also include high-speed or low-speed input/output (I/O) devices such as a network interface card (NIC) and storage, where the NIC is the underlying physical NIC, referred to below as the Host NIC to distinguish it from a virtual machine's virtual NIC (VM NIC).
  • Host: as a management layer, it manages and allocates hardware resources, presents a virtual hardware platform to virtual machines, and implements scheduling and isolation of virtual machines.
  • the Host may be a virtual machine monitor (VMM); or, sometimes, the VMM and a privileged virtual machine work together to form a Host.
  • A virtual disk can correspond to a file on the Host or to a logical block device.
  • the virtual machine runs on the virtual hardware platform that Host prepares for it, and one or more virtual machines run on the Host.
  • the virtual switch connects the virtual machines to each other under the control of the host and accesses the physical network.
  • The virtual switch works like a real switch.
  • the existing virtual switch is implemented in the host kernel and is at the core of the virtual network.
  • It supports virtual local area networks (VLANs), load balancing, tunneling, security, Link Aggregation Control Protocol (LACP), Quality of Service (QoS), and many other functions.
  • Shared memory: one of the mechanisms for inter-process communication, shared memory allows two or more processes to access the same block of memory; in this context, it allows two or more virtual machines and virtual hardware to access the same block of memory. Of the various forms of inter-process communication, shared memory is the most efficient.
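  • As a concrete, minimal illustration of the shared-memory idea (using Python's standard library rather than hypervisor shared pages), one handle creates a named region and writes bytes into it, and a second handle attached to the same name reads them back without the data ever passing through a pipe or socket:

```python
# Minimal shared-memory illustration using the standard library.
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=16)  # "writer" side creates the region
try:
    shm.buf[:6] = b"packet"                             # write data into the shared region
    reader = shared_memory.SharedMemory(name=shm.name)  # "reader" side attaches by name
    data = bytes(reader.buf[:6])                        # read directly; nothing is piped
    reader.close()
finally:
    shm.close()
    shm.unlink()                                        # release the region when done

assert data == b"packet"
```

In the patent's setting the region would be a designated area of the hardware layer's storage device negotiated by the vSwitch, and release corresponds to the read-completion step described earlier.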
  • Zero copy: a technique that avoids having the CPU copy data from one storage area to another. By reducing or eliminating operations on the critical communication path that limit the transfer rate, it lowers the overhead of data transmission, effectively improving communication performance and enabling high-speed data transfer. Common implementations include direct I/O and MMAP.
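  • The zero-copy idea can be illustrated with Python's `memoryview`, which exposes a window onto an existing buffer instead of duplicating its bytes; handing data over becomes passing a reference:

```python
# Sketch of zero copy: a memoryview is a view onto the same memory,
# so no bytes are duplicated when it is created or sliced.
buf = bytearray(b"to-be-exchanged")
view = memoryview(buf)          # no copy is made here
slice_ = view[3:5]              # still no copy: a window onto indices 3..4
buf[3:5] = b"BE"                # a write through the original buffer...
assert bytes(slice_) == b"BE"   # ...is visible through the view: same memory
```

This is the same property the patent exploits with shared memory: the reader sees the writer's bytes without an intermediate copy.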
  • SDN Software Defined Network
  • FIG. 2 is a schematic diagram showing the software and hardware architecture of a virtualization solution for deploying a vSwitch to a VM in the embodiment of the present invention.
  • the architecture mainly includes three layers: a hardware layer, a host, and a virtual machine (VM).
  • The hardware layer includes an I/O device, namely a physical network card (NIC), through which the computing node can communicate with other external hosts or networks; the hardware layer may also include storage devices such as memory and a hard disk.
  • Host runs on the hardware layer, where Host may be a virtual machine monitor (VMM), or sometimes VMM and a privileged virtual machine work together, and the two combine to form a Host.
  • the second case is shown in Figure 2.
  • At least one virtual machine (VM) runs on the Host. One of the VMs is a virtual machine having a virtual switching function in the present invention (the first virtual machine); there may also be multiple ordinary virtual machines (a second virtual machine, a third virtual machine, and so on).
  • The Config and Manage Module (CMM) in the Host can send configuration commands to the first virtual machine having the virtual switching function (hereinafter referred to as the vSwitch) to perform virtual network environment configuration.
  • The vSwitch configuration can be performed through a configuration agent module (Agent) in the vSwitch, including management and configuration of the port mapping table, the VLAN table, and the access control list (ACL).
  • The configuration management module of the Host can connect to the Agent module of the vSwitch through IPC (for example, IOCTL, NETLINK, SOCKET, and so on), so that the configuration of the Host's virtual environment can be transmitted to the vSwitch.
  • This can include configuration information such as the Host NIC, the VM backend (BE), shared memory, and DMA interrupts, enabling the vSwitch to obtain the virtual environment information needed to establish a corresponding virtual network environment.
  • The configuration management module creates a virtual NIC interface for each VM. The configuration management module can then negotiate, through the Agent module, the communication mechanism (communication mode) and port mapping between the vSwitch and the Host NIC, and the communication mechanism (communication mode) and port mapping between the vSwitch and the VMs' NICs.
  • The shared memory between the vSwitch and a VM's NIC can further be negotiated.
  • The vSwitch and the Host NIC can communicate using I/O passthrough or zero-copy modes.
  • The vSwitch and a VM can communicate using technologies such as shared memory and a front-end/back-end (FE/BE) event channel.
  • Entries are established according to the negotiated relationships of each configuration to generate a mapping table: for example, the address of a VM, the port number of the vSwitch virtual port corresponding to that VM, and the shared memory address negotiated between the VM and the vSwitch are associated to form an entry, where the VM is an ordinary virtual machine such as the second virtual machine.
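A minimal sketch of such a port mapping table, under the assumption that each entry associates a VM address, the vSwitch virtual port serving that VM, and the negotiated shared-memory address. The field names (`vm_address`, `vswitch_port`, `shm_address`) are illustrative, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PortMapEntry:
    vm_address: str     # e.g. the VM's MAC or IP address
    vswitch_port: int   # virtual port number on the vSwitch
    shm_address: int    # negotiated shared-memory base address

port_map = {}

def add_entry(vm_address, vswitch_port, shm_address):
    port_map[vm_address] = PortMapEntry(vm_address, vswitch_port, shm_address)

def lookup(vm_address):
    # The vSwitch resolves the destination address carried in the data
    # to the outbound virtual port and the shared-memory region.
    return port_map.get(vm_address)

add_entry("52:54:00:00:00:02", vswitch_port=1, shm_address=0x7f000000)
entry = lookup("52:54:00:00:00:02")
```

A lookup on an unknown address returns nothing, which in the OpenFlow variant discussed later corresponds to a table miss.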
  • The first virtual machine is configured to receive a first message sent by a source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged.
  • In the exchange process, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine.
  • The first virtual machine is further configured to determine a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and to send the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer. In this way, the data to be exchanged is forwarded through the signaling control of the vSwitch, completing the switching process.
  • FIG. 3 is a flow chart of a virtual switching method in accordance with an embodiment of the present invention.
  • The method of FIG. 3 is performed by a virtual machine having a virtual switching function (hereinafter referred to as the first virtual machine).
  • The first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged; the data to be exchanged is sent from the source node to the target node, and at least one of the source node and the target node is the second virtual machine.
  • The first virtual machine is a virtual machine with the virtual switching function; it has the same status as other ordinary virtual machines and runs on the Host.
  • The source node may be an ordinary virtual machine (VM) on the Host, or a virtual machine or physical machine outside the Host.
  • Because the Host communicates with the outside world through the Host NIC,
  • communication with a virtual machine or physical machine outside the Host is described here as communication with the Host NIC; that is, the source node may also be the Host NIC.
  • The target node can likewise be either an ordinary VM on the Host or the Host NIC.
  • The first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • The configured port mapping table may be configured by the first virtual machine, covering both the initial configuration of the port mapping table when the virtualized network is established and the dynamic maintenance of the port mapping table during later operation of the virtualized network.
  • The first virtual machine may be only the executor of the configuration commands; the configuration commands themselves can be issued by the Host or by network maintenance personnel.
  • Deploying the virtual switching function in a virtual machine decouples it from the Host kernel, which facilitates the Host's management of the virtual network and enables efficient and reasonable allocation of network resources.
  • The method further includes: receiving a configuration command sent by the Host; configuring, according to the configuration command, a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device; and establishing a mapping relationship between the first virtual port and the second virtual port to generate the port mapping table.
  • The first virtual machine configures, according to the configuration command, the first shared memory corresponding to the second virtual machine, where the first shared memory is a designated storage area on the storage device of the hardware layer.
  • The configuration management module in the Host can negotiate, through the Agent module in the vSwitch, the communication mechanism (communication mode) and port mapping between the vSwitch and the Host NIC, and the communication mechanism (communication mode) and port mapping between the vSwitch and the VMs' NICs.
  • The shared memory between the vSwitch and a VM's NIC may be further negotiated, where the shared memory is a designated storage area on the storage device of the hardware layer.
  • Mapping relationships among the negotiated configurations can then be established to generate the port mapping table: for example, the address of a VM, the port number of the vSwitch virtual port corresponding to that VM, and the shared memory address negotiated between the VM and the vSwitch are associated to generate an entry of the port mapping table.
  • The first virtual machine receives the data to be exchanged from the first virtual port of the first virtual machine, where the first virtual port corresponds to the source node, and sends the data to be exchanged to the target node through the second virtual port of the first virtual machine, where the second virtual port is determined by the first virtual machine according to the first virtual port and the pre-configured port mapping table.
  • The process of receiving the data to be exchanged from the first virtual port and sending it to the target node through the second virtual port constitutes the logical switching process of the first virtual machine.
  • The first virtual port through which the first virtual machine communicates with the source node, and the second virtual port through which it communicates with the target node, are pre-negotiated and configured.
  • In this case, the first virtual machine receiving the first message sent by the source node includes: the first virtual machine receives, through the first virtual port, the first message sent by the second virtual machine, where the first message includes a write-completion interrupt indicating to the first virtual machine that the second virtual machine has finished writing the data to be exchanged to the shared memory.
  • The first virtual machine determines the address of the corresponding first shared memory according to the first virtual port that received the first message, obtains the data to be exchanged from the first shared memory, and determines the second virtual port from the port mapping table according to the address of the I/O device carried in the data to be exchanged.
  • The second virtual machine in the Host, acting as the source node, establishes a virtual connection with the first virtual port, where the first virtual port is the virtual port corresponding to the second virtual machine pre-configured by the first virtual machine.
  • The second virtual machine sends the data to be exchanged to the first virtual port; physically, the data to be exchanged is written into the shared memory that the second virtual machine pre-negotiated with the first virtual machine.
  • The second virtual machine then sends a write-completion notification to the first virtual machine.
  • The first virtual machine queries its internally configured port mapping table to determine the second virtual port and the Host NIC corresponding to the second virtual port, and sends read indication information to the Host NIC through the second virtual port,
  • so that the Host NIC reads the data to be exchanged from the shared memory and can further send it to the target node outside the Host.
  • In this case, the target node may also be understood as the Host NIC.
  • When the source node is the I/O device and the target node is the second virtual machine, after receiving the first message sent by the source node, the first virtual machine obtains the address of the target node carried in the data to be exchanged from the I/O device, where the address of the target node is the address of the second virtual machine. Determining the second message according to the address of the carried target node and the configured port mapping table, and sending the second message, includes: the first virtual machine queries the port mapping table according to the address of the second virtual machine to determine the first shared memory corresponding to the second virtual machine;
  • after the write-completion interrupt arrives, that is, after the I/O device has finished writing the data to be exchanged into the first shared memory, the second message is sent to the second virtual machine so that the second virtual machine reads the data to be exchanged from the first shared memory.
  • The first virtual machine obtains the address of the target node carried in the data to be exchanged from the I/O device; that is, after the first virtual machine receives the notification of the first message, it learns that the I/O device (the underlying physical network card) has received data to be exchanged, and the first virtual machine can then directly access the data to be exchanged through the driver layer to obtain the address of the target node that it carries.
  • The at least one VM further includes a third virtual machine. Here the source node is the second virtual machine and the target node is the third virtual machine; that is, both the source node and the target node are ordinary virtual machines on the Host.
  • The first virtual machine receiving the first message sent by the source node includes: the first virtual machine receives, through the first virtual port, the first message sent by the second virtual machine, where the first message includes a write-completion interrupt indicating to the first virtual machine
  • that the second virtual machine has finished writing the data to be exchanged into the second shared memory pre-negotiated between the second virtual machine and the third virtual machine through the first virtual machine, where the second shared memory is a designated storage area on the storage device of the hardware layer.
  • The first virtual machine determining the second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sending the second message, includes: the first virtual machine determines, according to the first virtual port that received the first message, the address of the second virtual machine corresponding to the first virtual port; determines the address of the second shared memory according to the address of the second virtual machine and the address of the third virtual machine carried in the data to be exchanged; determines a second message carrying the address of the second shared memory and a read command; and sends the second message to the third virtual machine, so that the third virtual machine reads the data to be exchanged from the second shared memory.
  • The second shared memory is negotiated by the second virtual machine and the third virtual machine through the first virtual machine, for example by means of a Xen event channel.
  • The method further includes: receiving read-completion indication information sent by the target node, so as to release the first shared memory or the second shared memory. Specifically, after the target node reads the data to be exchanged, it sends the read-completion indication information to the first virtual machine; after receiving it, the first virtual machine restores the writable permission of the shared memory, that is, releases the shared memory. It should be understood that "first" and "second" shared memory are used above only for distinction, and the present invention is not limited thereto.
  • The first shared memory and the second shared memory are both portions of the memory space designated on the hardware-layer storage device, allocated with a degree of randomness and uncertainty. For example, after the first shared memory is released, it may later be allocated as the second shared memory; in that case, the first and second shared memory correspond to the same memory space.
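The point that "first" and "second" shared memory are labels rather than fixed regions can be sketched with a toy allocator. This is a hypothetical illustration, not the patent's allocation scheme: a released region returns to the pool and may be handed out again under a different label.

```python
class SharedMemoryPool:
    """Toy pool of shared-memory regions identified by base address."""
    def __init__(self, regions):
        self.free = list(regions)   # addresses of currently unused regions
        self.in_use = {}            # label -> region address

    def allocate(self, label):
        region = self.free.pop(0)
        self.in_use[label] = region
        return region

    def release(self, label):
        # Restoring writable permission / releasing returns the region
        # to the pool for any future negotiation.
        self.free.append(self.in_use.pop(label))

pool = SharedMemoryPool([0x1000])   # a single designated region
first = pool.allocate("first")      # used, say, for a VM -> Host NIC exchange
pool.release("first")
second = pool.allocate("second")    # the very same region, reused under a new label
```

With one region in the pool, `first` and `second` refer to the same memory space, exactly the situation the paragraph above describes.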
  • The port mapping table may be an OpenFlow flow table.
  • The first virtual machine matches, according to the address of the target node carried in the data to be exchanged, the address of the target node against the OpenFlow flow table.
  • The OpenFlow flow table includes at least one entry, and each entry includes an address, a virtual port, and an action parameter. If a matching entry exists, the first virtual machine processes the data to be exchanged according to the matched entry and the action parameter corresponding to the address of the target node; if no matching entry exists, the first virtual machine establishes a new entry that matches the data to be exchanged and inserts the new entry into the OpenFlow flow table.
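The match-or-insert behaviour described above can be sketched as follows. This is a minimal illustration of the flow-table logic, not the OpenFlow wire format; the entry fields and the `"forward"` action are assumptions for the example (in a real OpenFlow deployment the controller decides the action on a table miss).

```python
flow_table = []   # each entry: {"address": ..., "port": ..., "action": ...}

def handle(target_address, default_port):
    # Try to match the target address against existing entries.
    for entry in flow_table:
        if entry["address"] == target_address:
            return entry["action"], entry["port"]      # hit: apply the entry
    # Table miss: build a new entry that matches this data and insert it.
    new_entry = {"address": target_address,
                 "port": default_port,
                 "action": "forward"}
    flow_table.append(new_entry)
    return new_entry["action"], new_entry["port"]

a1 = handle("10.0.0.2", default_port=2)   # miss: installs an entry
a2 = handle("10.0.0.2", default_port=9)   # hit: the installed entry wins
```

Note that the second call ignores `default_port=9`: once an entry exists, subsequent data to the same address is processed according to the table.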
  • The computing node in the embodiment of the present invention includes: a hardware layer, a Host running on the hardware layer, and at least one virtual machine (VM) running on the Host, where the hardware layer includes an input/output (I/O) device and a storage device, the at least one VM includes a first virtual machine having a virtual switching function, and the at least one VM further includes a second virtual machine. In this way, the virtual switching function is implemented on a virtual machine, so that the virtual switch and ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture.
  • During resource allocation, the virtual switch and ordinary VMs use the physical resources of user space, which facilitates efficient and reasonable management and allocation of resources by the Host.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged, the data to be exchanged being sent from the source node to the target node, and at least one of the source node and the target node being the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • This method decouples the virtual switching function from the Host kernel and implements it on a virtual machine, which simplifies the design and reduces the burden of the Host kernel; and because of the flexibility and scalability of VMs,
  • the scalability and flexibility of the vSwitch and of the entire virtual network are improved, facilitating the separation of the control plane and the data plane to meet the requirements of SDN and to support OpenFlow.
  • FIG. 4 is a schematic diagram of a virtual switched data stream in accordance with an embodiment of the present invention.
  • The virtual switch (vSwitch, the virtual switching function) is deployed on the first virtual machine, so that the first virtual machine becomes a virtual switching device with the same status as the ordinary virtual machines VM1 and VM2.
  • the proxy agent module in the first virtual machine is connected to the configuration management module (Config and Manage Module) in the host Host, so that the system administrator configures the first virtual machine.
  • A virtual port of the first virtual machine can be connected to VM1, to VM2, or to the underlying physical network card (Host NIC).
  • The Agent module configures the port mapping and VLAN management of the vSwitch. Specifically, the communication mode, shared memory, and port between an ordinary VM and the vSwitch can be negotiated, the communication mode and port between the vSwitch and the Host NIC can be negotiated, and the port mapping of the vSwitch can be configured to generate the port mapping table.
  • The communication method may include shared memory, I/O passthrough, zero copy, or direct memory access (DMA).
  • Shared memory is a mechanism for operating system inter-process communication (IPC).
  • Zero copy is a technique that avoids having the central processing unit (CPU) copy data from one storage area to another; it can be implemented by I/O passthrough, MMAP, and so on.
  • The ordinary VM communicates with the vSwitch in shared-memory mode, and the vSwitch communicates with the Host NIC through I/O passthrough or DMA, so that the switching device of the present invention achieves zero copy, reducing the resource overhead of copying and improving switching efficiency.
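As an analogy for zero copy (not the patent's mechanism), operating systems expose system calls such as `sendfile(2)` that let the kernel move data to a socket without a user-space copy. Python's `socket.sendfile` wraps this where available; here a connected socket pair stands in for the transmit path.

```python
import os
import socket
import tempfile

# Write a small payload to a temporary file.
payload = b"A" * 4096
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

# A connected pair of sockets stands in for the transmit path.
rx, tx = socket.socketpair()
with open(path, "rb") as src:
    # Uses os.sendfile where available: the kernel moves the file pages
    # to the socket; no data is copied through user space.
    sent = tx.sendfile(src)
tx.close()

chunks = []
while True:
    chunk = rx.recv(65536)
    if not chunk:
        break
    chunks.append(chunk)
rx.close()
os.unlink(path)
received = b"".join(chunks)
```

On platforms without `os.sendfile`, `socket.sendfile` transparently falls back to an ordinary buffered send, so the sketch remains portable.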
  • When VM1 needs to send data to the Host NIC, VM1 first establishes a virtual connection with the first virtual port (port1) of the vSwitch.
  • Here port1 is the virtual port corresponding to VM1 pre-configured in step 401.
  • The corresponding physical process is that VM1 maps, through its virtual NIC (VM NIC), to the shared memory corresponding to VM1.
  • VM1 sends the data to be exchanged to port1 through its NIC.
  • The corresponding actual physical process is writing the data to be exchanged into the shared memory corresponding to VM1.
  • port1 then sends write-completion indication information to the vSwitch to notify the vSwitch to perform the next step.
  • The write-completion indication information may be a write-completion interrupt.
  • After receiving the write-completion indication information sent by VM1, the vSwitch enters the switching process: it queries the port mapping table configured by the Agent module in the vSwitch to determine the outbound port for the data to be exchanged (the second virtual port, port2) and the corresponding Host NIC.
  • The port mapping table stores correspondences among information such as input port, output port, source address, and destination address; the vSwitch can therefore determine the output port according to the destination address and port carried in the data to be exchanged, completing the switching process.
  • The input/output port information here may be the port numbers of the vSwitch's virtual ports, and the source/destination address may be the Internet Protocol (IP) address or the Media Access Control (MAC) address of the source/target node.
  • After port2 is determined, the vSwitch sends read indication information to the Host NIC through port2; the read indication information carries the address of the shared memory in which the data to be exchanged is stored, so that the Host NIC reads the data to be exchanged from that shared memory.
  • After the Host NIC reads the data, it can send the data to be exchanged to the device or node connected to the Host, and it sends read-completion indication information to the vSwitch through port2, so that the vSwitch can restore the writable permission of the shared memory, that is, release the shared memory.
  • The read-completion indication information may be a read-completion interrupt.
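The VM1-to-Host-NIC signalling sequence above can be summarized in a heavily simplified simulation. All names and data structures here are illustrative assumptions, not the patent's implementation: VM1 writes into shared memory and raises a write-completion interrupt, the vSwitch resolves the outbound port from the port mapping table and issues a read indication, and the read-completion restores write permission.

```python
shared_memory = {}   # shm address -> payload
# in-port -> (out-port, device behind it, negotiated shm address)
port_map = {"port1": ("port2", "host_nic", 0x1000)}
writable = {0x1000: True}
events = []          # trace of control-plane signals

def vm1_send(data):
    # VM1 writes the data into its negotiated shared memory, then
    # raises a write-completion interrupt toward the vSwitch.
    shared_memory[0x1000] = data
    writable[0x1000] = False
    vswitch_on_write_complete("port1")

def vswitch_on_write_complete(in_port):
    # The vSwitch looks up the outbound port and target device,
    # then sends a read indication carrying the shm address.
    out_port, device, shm = port_map[in_port]
    events.append(("read_indication", device, out_port, shm))
    host_nic_read(shm, out_port)

def host_nic_read(shm, out_port):
    events.append(("nic_read", shared_memory[shm]))
    # The read-completion lets the vSwitch restore write permission,
    # i.e. release the shared memory.
    writable[shm] = True

vm1_send(b"frame")
```

Only control signals cross the vSwitch; the payload itself stays in the shared region the whole time, which is the zero-copy property the design aims for.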
  • The specific process of virtual switching is illustrated above by taking data to be exchanged as an example; in practice, the object of the virtual exchange may also be a data flow, signaling, a message, and so on, which the present invention does not limit.
  • The virtual switching function is implemented on a virtual machine, so that the virtual switch and ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation, the virtual switch and ordinary VMs use the physical resources of user space, which facilitates efficient and reasonable management and allocation of resources by the Host.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged, the data to be exchanged being sent from the source node to the target node, and at least one of the source node and the target node being the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • FIG. 5 is a schematic diagram of a virtual switched data stream according to another embodiment of the present invention.
  • The virtual switch (vSwitch, the virtual switching function) is deployed on the first virtual machine, so that the first virtual machine becomes a virtual switching device with the same status as the ordinary virtual machines VM1 and VM2.
  • the proxy agent module in the first virtual machine is connected to the configuration management module (Config and Manage Module) in the host Host, so that the system administrator configures the first virtual machine.
  • A virtual port of the first virtual machine can be connected to VM1, to VM2, or to the underlying physical network card (Host NIC).
  • It should be understood that the system architecture shown in FIG. 5 is only an example, and the number of modules such as VMs, ports, and Host NICs can be expanded. 501. Pre-configuration.
  • After receiving the data to be exchanged from the outside (the source node), the Host NIC queries the address of the target node (VM1) and sends a request message carrying the address of VM1 to the vSwitch through port1, where port1 is the virtual port corresponding to the Host NIC pre-configured by the Agent module in step 501.
  • The vSwitch driver layer then directly accesses the data to be exchanged and queries the port mapping table pre-configured by the Agent module in the vSwitch to determine the outbound port for the data to be exchanged (the second virtual port, port2) and the corresponding shared memory. A reply message carrying the shared memory address is then sent to the Host NIC through port1. 503. Write the data to be exchanged.
  • After receiving the shared memory address, the Host NIC writes the data to be exchanged into the shared memory.
  • The writing mode is pre-configured by the Agent module in step 501, for example, writing by DMA.
  • After the Host NIC finishes writing, port1 sends write-completion indication information to the vSwitch to notify the vSwitch to perform the next operation.
  • The write-completion indication information may be a write-completion interrupt. 504. Read the data to be exchanged.
  • After VM1 reads the data to be exchanged from the shared memory, it sends read-completion indication information to the vSwitch through port2, so that the vSwitch can restore the writable permission of the shared memory, that is, release the shared memory.
  • The specific process of virtual switching is illustrated above by taking data to be exchanged as an example; in practice, the object of the virtual exchange may also be a data flow, signaling, a message, and so on, which the present invention does not limit.
  • The virtual switching function is implemented on a virtual machine, so that the virtual switch and ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation, the virtual switch and ordinary VMs use the physical resources of user space.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged, the data to be exchanged being sent from the source node to the target node;
  • at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • This method decouples the virtual switching function from the Host kernel and implements it on a virtual machine, which simplifies the design and reduces the burden of the Host kernel; and because of the flexibility and scalability of VMs, this increases the scalability and flexibility of the vSwitch and of the entire virtual network.
  • FIG. 6 is a schematic diagram of a virtual switched data stream in accordance with another embodiment of the present invention.
  • The virtual switch (vSwitch, the virtual switching function) is deployed on the first virtual machine, so that the first virtual machine becomes a virtual switching device with the same status as the ordinary virtual machines VM1 and VM2.
  • the proxy agent module in the first virtual machine is connected to the configuration management module (Config and Manage Module) in the host Host, so that the system administrator configures the first virtual machine.
  • A virtual port of the first virtual machine can be connected to VM1, to VM2, or to the underlying physical network card (Host NIC).
  • the configuration command can be sent to the agent module in the first virtual machine by using the Config and Manage Module on the host, so that the agent module configures the port mapping and VLAN management of the vSwitch.
  • the specific configuration process and configuration items are similar to the foregoing step 301 in FIG. 3, and details are not described herein again.
  • VM1 can negotiate with VM2 through vSwitch, and a shared memory is created by vSwitch for VM1 and VM2 to share.
  • the specific negotiation process can be performed using the mechanism of the Xen event channel.
  • VM1 establishes a virtual connection with the first virtual port port1 of the vSwitch, where port1 is a virtual port corresponding to VM1 pre-configured by the agent module in step 601.
  • the corresponding physical process is that VM1 maps to the shared memory negotiated by VM1 and VM2 through its virtual NIC VM NIC.
  • VM1 sends the data to be exchanged to port1 through its NIC.
  • the corresponding actual physical process is to write the data to be exchanged to the shared memory corresponding to VM1.
  • the port1 sends a write completion message to the vSwitch to notify the vSwitch to perform the next step.
  • Read the data to be exchanged: the vSwitch sends read indication information to VM2 so that VM2 reads the data to be exchanged from the shared memory.
  • After VM2 reads the data, it sends read-completion indication information to the vSwitch, so that the vSwitch can restore the writable permission of the shared memory, that is, release the shared memory.
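The VM1-to-VM2 exchange of FIG. 6 can be sketched in the same spirit. This is a hypothetical simulation with illustrative names: the vSwitch brokers one shared region per VM pair (e.g. via a Xen event-channel style handshake), VM1 writes and signals write-completion, the vSwitch forwards a read indication to VM2, and VM2's read-completion releases the region.

```python
pair_shm = {}   # (src VM, dst VM) -> negotiated shared region
log = []        # trace of control-plane signals

def negotiate(src, dst):
    # The vSwitch creates one shared region for the VM pair.
    pair_shm[(src, dst)] = {"data": None, "writable": True}

def vm_write(src, dst, data):
    # Source VM writes into the pair's region and signals completion.
    region = pair_shm[(src, dst)]
    region["data"], region["writable"] = data, False
    log.append(("write_complete", src))
    # The vSwitch forwards a read indication to the target VM.
    vm_read(src, dst)

def vm_read(src, dst):
    region = pair_shm[(src, dst)]
    log.append(("read", dst, region["data"]))
    region["writable"] = True   # read-completion releases the region

negotiate("VM1", "VM2")
vm_write("VM1", "VM2", b"payload")
```

As in the Host NIC case, the payload never leaves the shared region; the vSwitch handles only the write-completion and read-indication signalling between the two VMs.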
  • The specific process of virtual switching is illustrated above by taking data to be exchanged as an example; in practice, the object of the virtual exchange may also be a data flow, signaling, a message, and so on, which the present invention does not limit.
  • The virtual switching function is implemented on a virtual machine, so that the virtual switch and ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation, the virtual switch and ordinary VMs use the physical resources of user space.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by the source node, where the first message is used to request the first virtual machine to exchange the data to be exchanged, the data to be exchanged being sent from the source node to the target node;
  • at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, where the second message is used to instruct the target node to acquire the data to be exchanged from the storage device of the hardware layer.
  • This method decouples the virtual switching function from the Host kernel and implements it on a virtual machine, which simplifies the design and reduces the burden of the Host kernel; and because of the flexibility and scalability of VMs, the scalability and flexibility of the vSwitch and of the entire virtual network are improved.
  • FIG. 7 is a schematic diagram of a virtual switching device for a software defined network SDN according to another embodiment of the present invention.
  • The present invention decouples the virtual switch (vSwitch) from the Host kernel and deploys the vSwitch in the first virtual machine, which simplifies the design and complexity of the Host kernel. Because the configurability, scalability, and flexibility of a virtual machine are high, the scalability and flexibility of the vSwitch and of the entire virtualized network are also improved, so that the virtual switching device of the embodiment of the present invention can implement the separation of the control plane and the data plane,
  • that is, meet the needs of SDN.
  • SDN is a new-generation network architecture. It differs from the traditional network architecture in protocol layering and in the relationship between the control plane and the data plane: SDN consolidates the protocol at the operation and control level and separates the control plane from the data plane.
  • A typical SDN solution is OpenFlow.
  • OpenFlow is implemented on the first virtual machine having the virtual switching function in the embodiment of the present invention.
  • The logical implementation of the switching device can be divided into two parts: the OpenFlow Controller and the OpenFlow Flowtable. The OpenFlow Controller is responsible for the control plane: network topology configuration, data forwarding policy adjustment, and configuration and maintenance of the OpenFlow flow table. The OpenFlow flow table is responsible for the data plane and serves as the query mapping table for forwarding data flows.
  • The present invention can adopt the following two deployment modes. First, the OpenFlow Controller and the OpenFlow Flowtable are implemented in the same VM, that is, in the first virtual machine with the virtual switching function of the present invention, where the Controller runs in user space and the Flowtable can run either in user space or in kernel space. Second, the OpenFlow Controller and the OpenFlow Flowtable are implemented in two virtual machines that each have the virtual switching function: for example, the Controller may be deployed in the first virtual machine while the at least one VM running on the Host further includes a fourth virtual machine, similar to the first virtual machine, that hosts the Flowtable, and the two cooperate to provide the virtual switching function.
  • The Controller and the Flowtable of the virtual switch vSwitch are deployed on the first virtual machine, or on two different virtual machines, so that the vSwitch has the same status as the ordinary virtual machines VM1 and VM2.
  • The proxy (Agent) module in the Controller connects to the configuration management module (Config and Manage Module) in the host Host, so that the system administrator can configure the vSwitch.
  • The virtual ports of the Flowtable part can be connected to VM1, VM2, or to the underlying physical NIC (Host NIC) of the VMM. It should be understood that the system architecture shown in FIG. 7 is only an example; the number of modules such as VMs, ports, and Host NICs can be expanded. The OpenFlow Controller and the Flowtable work together to forward service flows.
  • the Controller contains a user configuration database and a rule base.
  • The Flowtable is a table structure keyed by service flow, with a match part and an action part. Each entry of the Flowtable represents a service flow; the match part holds fields such as the IP, MAC, and port of the data to be exchanged, and the action part indicates how matching data is processed: forwarding, dropping the packet, or requesting a new entry from the Controller. For example, when data to be exchanged arrives at the vSwitch, the vSwitch checks its IP, MAC, and port fields and searches the Flowtable for a matching entry. If a matching entry is found, the corresponding action is executed; if none is found, the Flowtable sends an entry request to the Controller. After receiving the request, the Controller queries the rule base, creates a new entry, and sends it to the Flowtable; the Flowtable inserts the new entry, and the data to be exchanged is then forwarded according to that entry's rule.
  • Because the virtual switching function is implemented on a virtual machine, the virtual switch and the ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation both the virtual switch and the ordinary VMs use user-space physical resources, which makes it convenient for the Host to manage and allocate resources efficiently and reasonably.
  • This method decouples the virtual switching function from the host kernel, reducing the coupling between the Host and the vSwitch; multiple vSwitches can be deployed in the same Host without being bound to it, and because a VM is flexible and scales well, the scalability and flexibility of the vSwitch and of the entire virtual network are increased.
  • The present invention also separates the configuration module from the forwarding module for the data to be exchanged, which better conforms to programmable network design, so that SDN can be implemented on the virtualized network architecture of this embodiment of the present invention.
  • FIG. 8 is a schematic illustration of a distributed implementation of another embodiment of the present invention. As shown in FIG. 8, the configuration of this embodiment includes a master virtual switch (Master vSwitch) and two slave virtual switches (Slave vSwitch). FIG. 8 shows only two slave vSwitches for convenience of description; this does not limit the present invention, and in practice there can be any number of slave vSwitches.
  • Each Host in FIG. 8 is the same as the Host running on the hardware layer described in the above embodiments. These Hosts may run on the hardware layer of the same physical machine, or on the hardware layers of different physical machines; the present invention does not limit this.
  • Each vSwitch is a virtual machine with the virtual switching function according to the present invention, that is, each vSwitch is similar to the first virtual machine with the virtual switching function in the foregoing embodiments.
  • The master management module and the slave management modules in the Hosts may correspond to the configuration management module (Config and Manage Module) of the Host in the above embodiments: the control management module of the Host hosting the master vSwitch is set as the master management module (Master Manager), and the control management module of each Host hosting a slave vSwitch is set as a slave management module (Slave Manager).
  • The Master Manager and the Slave Managers manage the vSwitches of their Hosts in the same way as in the above embodiments: the vSwitch can be configured and managed through the Agent module inside it.
  • The Master Manager is the user-facing configuration interface and can be configured directly by the user through a client program.
  • The Master Manager communicates with the Slave Managers through a configuration protocol and maintains the port mappings among the vSwitches.
  • The communication between the Master Manager and the Slave Managers is the control flow, and the communication between the master vSwitch and the slave vSwitches is the data flow.
  • The configuration process of the distributed vSwitch in this embodiment of the present invention is as follows. First, a master vSwitch is created on one Host; then a vSwitch cascading configuration is created, covering each Slave vSwitch and the IP addresses and port mappings of all the vSwitches; the configuration information is then sent to the other Hosts through a configuration protocol.
  • The Host that hosts the master vSwitch is the primary Host, and the other Hosts that receive the configuration information are the secondary Hosts.
  • Each secondary Host that receives the configuration information creates a control management module, that is, a slave management module, and each slave management module configures the IP address and ports of its slave vSwitch according to the received configuration information.
  • The configuration protocol involved includes, but is not limited to, application protocols such as the Extensible Markup Language (XML) and the Hypertext Transfer Protocol (HTTP).
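As a concrete illustration of such a configuration protocol, the cascade configuration might be carried in an XML document like the following (the element and attribute names, hosts, and addresses are hypothetical; the patent only states that XML/HTTP-class protocols may be used):

```python
# Hypothetical cascade-configuration document for the distributed vSwitch,
# listing the master, the slaves with their IP addresses, and a port mapping.
import xml.etree.ElementTree as ET

CONFIG = """\
<vswitch-cascade master="host0">
  <slave host="host1" ip="192.168.0.11"/>
  <slave host="host2" ip="192.168.0.12"/>
  <portmap vport="vport1" nic="hostnic0"/>
</vswitch-cascade>
"""

def parse_cascade(xml_text):
    # A secondary Host's Slave Manager would parse the received document
    # and configure its local slave vSwitch accordingly.
    root = ET.fromstring(xml_text)
    slaves = {s.get("host"): s.get("ip") for s in root.findall("slave")}
    portmap = {p.get("vport"): p.get("nic") for p in root.findall("portmap")}
    return root.get("master"), slaves, portmap

master, slaves, portmap = parse_cascade(CONFIG)
print(master, slaves["host1"], portmap["vport1"])
```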
  • The configuration process of the distributed switching architecture of this embodiment of the present invention is shown in FIG. 9: 901, the user logs in to the Manage Module on Host0, creates a vSwitch instance, and defines it as the Master; the Manage Modules of Host1 and Host2 receive the configuration message, create vSwitch instances according to the configuration requirements, define them as slaves, and point their Master pointers to the vSwitch of Host0; the port mappings of the vSwitches are then configured according to the port mappings in the configuration.
  • This embodiment of the present invention decouples the virtual switching function from the host kernel and reduces the coupling between the Host and the vSwitch; multiple vSwitches can be deployed in the same Host without being bound to it, and the vSwitch is implemented in the guest operating system (Guest OS).
  • FIG. 10 is a schematic diagram of a module architecture of a host machine according to an embodiment of the present invention.
  • the host 1000 of FIG. 10 includes a creation module 1001 and a configuration module 1002.
  • The creation module 1001 is configured to generate at least one virtual machine VM on the host Host after the I/O virtualization function of the input/output I/O device is started, where the at least one VM includes a first virtual machine with a virtual switching function and further includes a second virtual machine. The configuration module 1002 is configured to send a configuration command to the first virtual machine, so that the first virtual machine configures, according to the configuration command, a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device.
  • The host 1000 of this embodiment may be the Host in the foregoing method embodiments, and the functions of its functional modules may be implemented according to the methods in those embodiments; for the specific implementation process, refer to the related descriptions of the foregoing method embodiments, which are not repeated here.
  • After the I/O virtualization function of the I/O device is started, the host 1000 generates, through the creation module 1001, at least one virtual machine running on the host 1000.
  • The creation module 1001 may be the configuration management module (Config and Manage Module), and it may also create the virtual network interface card (VM NIC) of a virtual machine by using a tool such as Qemu.
  • The configuration module 1002, that is, the Config and Manage Module, sends a configuration command to the Agent module. The configuration module 1002 is connected to the Agent through an inter-process communication (IPC) technology such as IOCTL, NETLINK, or SOCKET, and passes the Host virtual-environment information to it. The configuration command received by the Agent of the virtual machine may include configuration information such as the MAC of the underlying physical NIC of the host 1000, the FE/BE of the virtual machine, the shared memory, and the DMA interrupt, so that the first virtual machine obtains the virtual-environment information and can establish the corresponding virtual network environment.
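A rough sketch of the Host-side configuration module pushing such a configuration command to the Agent over a local IPC socket follows; the message fields and values are illustrative assumptions, not the patent's wire format:

```python
# Sketch of the Host's Config and Manage Module sending virtual-environment
# information to the vSwitch VM's Agent over a local socket (one of the IPC
# channels the text mentions). All field names and values are invented.
import json
import socket

host_side, agent_side = socket.socketpair()

config_cmd = {
    "nic_mac": "52:54:00:12:34:56",    # underlying physical NIC MAC (example)
    "shared_mem": {"vm2": "0x1000"},   # per-VM shared-memory region (hypothetical)
    "dma_irq": 11,                     # DMA interrupt line (hypothetical)
}
host_side.sendall(json.dumps(config_cmd).encode())
host_side.shutdown(socket.SHUT_WR)

# The Agent decodes the command and can now set up its virtual network environment.
received = json.loads(agent_side.recv(4096).decode())
print(received["nic_mac"])
```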
  • In this way the virtual switching function can be stripped and decoupled from the kernel of the host 1000 and implemented on the first virtual machine, which simplifies the design and reduces the burden of the Host kernel; and the flexibility and scalability of a VM make the vSwitch and the entire virtual network more scalable and flexible.
  • Because the virtual switching function is implemented on a virtual machine, the virtual switch has the same status and priority as an ordinary VM, forming a peer-to-peer network virtualization architecture; during resource allocation the virtual switch and the ordinary VMs use user-space physical resources, which facilitates management by the Host and efficient, rational resource allocation.
  • FIG. 11 is a schematic diagram of a module architecture of a virtual machine according to an embodiment of the present invention.
  • The virtual machine 1100 of FIG. 11 includes a receiving module 1101, an exchange processing module 1102, and a sending module 1103.
  • The receiving module 1101 is configured to receive a first message sent by a source node, where the first message requests the virtual machine 1100 to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, at least one of the source node and the target node is a second virtual machine, and the second virtual machine runs on the Host.
  • The exchange processing module 1102 is configured to determine a second message according to the address of the target node carried in the data to be exchanged and the port mapping table configured on the virtual machine 1100, where the second message instructs the target node to acquire the data to be exchanged from a storage device of the hardware layer; the sending module 1103 is configured to send the second message to the target node.
  • The virtual machine 1100 of this embodiment of the present invention is a virtual machine with a virtual switching function; it has the same status as the other, ordinary virtual machines and is deployed on the Host.
  • The source node can be an ordinary virtual machine on the Host, or a virtual machine or physical machine outside the Host; the target node can likewise be an ordinary virtual machine on the Host, or a virtual machine or physical machine outside the Host.
  • The virtual machine 1100 of this embodiment may be the first virtual machine with the virtual switching function in the foregoing method embodiments, and the functions of its functional modules may be implemented according to the methods in those embodiments.
  • Deploying the virtual switching function in a virtual machine simplifies the VMM and facilitates the Host's management of the virtual network and efficient, reasonable allocation of network resources.
  • the virtual machine 1100 further includes a proxy agent module 1104 and a generating module 1105.
  • The proxy (Agent) module 1104 is configured to configure, according to a configuration command sent by the Host, a first virtual port 1106 of the virtual machine for communicating with the second virtual machine, and a second virtual port 1107 for communicating with the I/O device; the generating module 1105 is configured to establish a mapping relationship between the first virtual port 1106 and the second virtual port 1107 to generate the port mapping table.
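The port configuration and mapping-table generation described for the Agent module 1104 and the generating module 1105 can be sketched as follows (port and VM names are invented for illustration):

```python
# Sketch of the Agent configuring the VM-facing (first) and NIC-facing
# (second) virtual ports, and the generating module recording the mapping
# between them.

class PortMappingTable:
    def __init__(self):
        self.vm_port_of = {}   # peer VM -> first virtual port (VM-facing)
        self.io_port_of = {}   # first virtual port -> second virtual port (NIC-facing)

    def configure(self, peer_vm, first_port, second_port):
        self.vm_port_of[peer_vm] = first_port
        self.io_port_of[first_port] = second_port

    def io_port_for(self, peer_vm):
        # Look up the NIC-facing port paired with a VM's VM-facing port.
        return self.io_port_of[self.vm_port_of[peer_vm]]

table = PortMappingTable()
table.configure("vm2", "vport1", "vport2")
print(table.io_port_for("vm2"))   # vport2
```

A message arriving on `vport1` is thus forwarded out of `vport2`, and vice versa, which is exactly the lookup the exchange processing module performs.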
  • The Agent module 1104 is further configured to configure, according to the configuration command, the first shared memory corresponding to the second virtual machine, where the first shared memory is a designated storage area on the storage device of the hardware layer.
  • The first shared memory may be negotiated through an event channel between the second virtual machine and the virtual machine 1100.
  • When the source node is the second virtual machine and the target node is the I/O device: the receiving module 1101 is specifically configured to receive the first message through the first virtual port 1106, where the first message includes a write-completion interrupt indicating to the virtual machine 1100 that the source node has finished writing the data to be exchanged into the first shared memory; the exchange processing module 1102 is specifically configured to determine the address of the corresponding first shared memory according to the first virtual port 1106 on which the first message was received, and to obtain from the first shared memory the address of the target node carried in the data to be exchanged, so as to determine the target node; and the sending module 1103 is specifically configured to send the second message to the target node through the second virtual port 1107 that corresponds to the first virtual port 1106 in the port mapping table.
  • When the source node is the I/O device and the target node is the second virtual machine: the receiving module 1101 is specifically configured to receive the first message sent by the source node; the exchange processing module 1102 is specifically configured to obtain the address of the target node carried in the data to be exchanged, query the port mapping table according to that address to determine the first virtual port 1106 corresponding to the target node, and determine the address of the first shared memory corresponding to the second virtual machine; the sending module 1103 is specifically configured to send, through the second virtual port 1107 corresponding to the I/O device, a reply message carrying the address of the first shared memory to the source node, so that the source node can write the data to be exchanged into the first shared memory; the receiving module 1101 is further configured to receive the write-completion interrupt sent by the source node to indicate to the virtual machine 1100 that the source node has finished writing the data to be exchanged into the first shared memory; the exchange processing module 1102 is further configured, upon receipt of that interrupt, to determine the second message carrying a read command; and the sending module 1103 is further configured to send the second message to the target node through the first virtual port 1106.
  • When the source node and the target node are both ordinary virtual machines (for example, the source node is the second virtual machine and the target node is a third virtual machine): the receiving module 1101 is specifically configured to receive, through the first virtual port 1106, the first message sent by the source node, where the first message includes a write-completion interrupt; the exchange processing module 1102 is specifically configured to determine the address of the corresponding source node according to the first virtual port 1106 on which the first message was received, determine the address of the second shared memory according to the address of the source node and the address of the target node carried in the data to be exchanged, and determine the second message carrying the address of the second shared memory and a read command; and the sending module 1103 is specifically configured to send the second message to the target node.
  • The receiving module 1101 is further configured to receive read-completion indication information sent by the target node, so that the virtual machine 1100 releases the first shared memory or the second shared memory.
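Putting the receive, exchange, send, and release steps together, the shared-memory exchange can be simulated end to end (a toy model under assumed names; real shared memory, interrupts, and virtual ports are of course not Python dictionaries):

```python
# End-to-end sketch of the exchange the modules above describe: the source
# writes into shared memory, the vSwitch VM translates the "write complete"
# notification into a read command for the target, and frees the region
# once the target has read it. All names and addresses are illustrative.

shared_mem = {}                          # region address -> payload
port_to_region = {"vport1": "region_a"}  # first virtual port -> shared region
port_map = {"vport1": "vport2"}          # first -> second virtual port

def source_write(region, payload, target_addr):
    shared_mem[region] = {"to": target_addr, "data": payload}
    return {"type": "write_complete", "port": "vport1"}    # the first message

def vswitch_handle(first_msg):
    region = port_to_region[first_msg["port"]]             # port -> region
    target = shared_mem[region]["to"]                      # address carried by the data
    out_port = port_map[first_msg["port"]]                 # port mapping lookup
    return {"type": "read", "region": region, "to": target, "port": out_port}

def target_read(second_msg):
    payload = shared_mem[second_msg["region"]]["data"]
    del shared_mem[second_msg["region"]]                   # release after read complete
    return payload

msg2 = vswitch_handle(source_write("region_a", "hello", "nic0"))
print(target_read(msg2), shared_mem)   # hello {}
```

Note that the payload itself never passes through the vSwitch VM; only the two short messages do, which is the point of the shared-memory design.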
  • When the source node is the I/O device, the first virtual machine obtains the address of the target node carried in the data to be exchanged from the I/O device: after the first virtual machine receives the notification of the first message, it learns that the I/O device (that is, the underlying physical network card) has received the data to be exchanged, and the first virtual machine can then directly access the data to be exchanged through the driver layer to obtain the address of the target node it carries.
  • Optionally, the virtual machine 1100 further includes an OpenFlow controller, and the OpenFlow controller includes the Agent module 1104. The receiving module 1101 receives the first message sent by the source node, and the exchange processing module 1102 is further configured to: determine, according to the address of the target node carried in the data to be exchanged, an entry in the OpenFlow flow table that matches the address of the target node, where the OpenFlow flow table includes at least one entry and each entry includes an address, a virtual port, and an action parameter; if a matching entry exists, process the data to be exchanged according to the action parameter corresponding to the address of the target node in the matching entry; and if no matching entry exists, send an entry-establishment request to the OpenFlow controller, so that the controller creates a new entry matching the data to be exchanged according to the request and inserts the new entry into the OpenFlow flow table.
  • This embodiment of the present invention deploys the virtual switching function in the virtual machine 1100, so that the virtual machine 1100 with the virtual switching function has the same status as the other ordinary virtual machines, which facilitates the Host's management of the virtual network and efficient, reasonable allocation of network resources. And because the virtual switching function is stripped from the Host kernel, scalability is enhanced; the virtual machine 1100 meets the requirements of SDN and supports OpenFlow.
  • FIG. 12 is a schematic illustration of a computing node according to one embodiment of the present invention. The computing node 1200 shown in FIG. 12 can include: a hardware layer 1210, a host Host 1220 running on the hardware layer 1210, and at least one virtual machine VM 1230 running on the Host 1220; the hardware layer 1210 includes an input/output I/O device 1211 and a storage device 1212, and the at least one virtual machine VM 1230 includes a first virtual machine 1231 with a virtual switching function and further includes a second virtual machine 1232.
  • The first virtual machine 1231 is configured to receive a first message sent by a source node, where the first message requests the first virtual machine 1231 to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine 1232. The first virtual machine 1231 is further configured to determine a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and to send the second message, which instructs the target node to acquire the data to be exchanged from the storage device 1212 of the hardware layer.
  • The Host 1220 is configured to send a configuration command to the first virtual machine 1231; the first virtual machine 1231 is further configured to configure, through its proxy (Agent) module and according to the configuration command, a first virtual port of the first virtual machine for communicating with the second virtual machine and a second virtual port of the first virtual machine for communicating with the I/O device 1211, and to establish a mapping relationship between the first virtual port and the second virtual port to generate the port mapping table.
  • The first virtual machine 1231 is further configured to configure, according to the configuration command, the first shared memory corresponding to the second virtual machine 1232, where the first shared memory is a designated storage area on the storage device 1212 of the hardware layer 1210.
  • When the source node is the second virtual machine 1232 and the target node is the I/O device 1211: the source node 1232 is configured to write the data to be exchanged into the first shared memory and to send the first message to the first virtual machine 1231; the first virtual machine 1231 is specifically configured to receive the first message through the first virtual port and to send the second message to the target node 1211 through the second virtual port that corresponds to the first virtual port in the port mapping table; and the target node 1211 is configured to read the data to be exchanged from the first shared memory according to the second message.
  • When the source node is the I/O device 1211 and the target node is the second virtual machine 1232: the first virtual machine 1231 is specifically configured to receive the first message sent by the source node 1211; obtain the address of the target node 1232 carried in the data to be exchanged; query the port mapping table according to the address of the target node 1232 to determine the first virtual port corresponding to the target node 1232 and determine the address of the first shared memory corresponding to the second virtual machine 1232; send, through the second virtual port corresponding to the I/O device 1211, a reply message carrying the address of the first shared memory to the source node 1211; and, upon receiving the write-completion interrupt sent by the source node 1211 to indicate to the first virtual machine that the source node 1211 has finished writing the data to be exchanged into the first shared memory, determine the second message carrying a read command and send it to the target node 1232 through the first virtual port. The source node 1211 is further configured to write the data to be exchanged into the first shared memory according to the address of the first shared memory in the reply message, and to send to the first virtual machine the write-completion interrupt indicating that the source node 1211 has finished writing the data to be exchanged; the target node 1232 is configured to read the data to be exchanged from the first shared memory according to the second message.
  • When the source node and the target node are both ordinary virtual machines among the at least one VM 1230, assume the source node is the second virtual machine 1232 and the target node is a third virtual machine 1233. The source node 1232 is specifically configured to write the data to be exchanged into the second shared memory that the source node 1232 and the target node 1233 pre-negotiated through the first virtual machine 1231, where the second shared memory is a designated storage area on the storage device 1212 of the hardware layer 1210.
  • The source node 1232 is further configured to send the first message to the first virtual machine through the first virtual port, where the first message includes a write-completion interrupt.
  • The first virtual machine 1231 is specifically configured to determine the address of the corresponding source node 1232 according to the first virtual port; determine the address of the second shared memory according to the address of the source node 1232 and the address of the target node 1233 carried in the data to be exchanged; determine the second message carrying the address of the second shared memory and a read command; and send the second message to the target node 1233.
  • the target node 1233 is configured to read the data to be exchanged from the second shared memory according to the second message.
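The VM-to-VM case, where the shared region is derivable from the (source, target) pair alone because it was pre-negotiated through the vSwitch VM, can be sketched as follows (all names and region labels are invented):

```python
# Sketch of VM-to-VM exchange: the second shared memory is pre-negotiated
# per (source, target) pair, so the vSwitch can derive it from the two
# addresses without any per-message lookup of the payload itself.

negotiated = {("vm2", "vm3"): "region_b"}   # agreed via the vSwitch beforehand
shared_mem = {}                             # region -> payload

def exchange(source, target, payload):
    region = negotiated[(source, target)]           # derive region from the pair
    shared_mem[region] = payload                    # source writes the data
    second_msg = {"cmd": "read", "region": region}  # vSwitch's read command
    return shared_mem[second_msg["region"]]         # target reads per the command

print(exchange("vm2", "vm3", "packet-bytes"))   # packet-bytes
```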
  • The target node may send read-completion indication information to the first virtual machine 1231 so that the first shared memory or the second shared memory can be released; the first virtual machine 1231 releases the first shared memory or the second shared memory after receiving the read-completion indication information.
  • When the source node is the I/O device, the first virtual machine obtains the address of the target node carried in the data to be exchanged from the I/O device: after receiving the notification of the first message, the first virtual machine learns that the I/O device (that is, the underlying physical network card) has received the data to be exchanged, and it can then directly access the data to be exchanged through the driver layer to obtain the address of the target node it carries.
  • Optionally, the first virtual machine 1231 is further configured to determine, according to the address of the target node carried in the data to be exchanged, an entry in the configured OpenFlow flow table that matches the address of the target node, where the OpenFlow flow table includes at least one entry and each entry includes an address, a virtual port, and an action parameter; if a matching entry exists, the data to be exchanged is processed according to the action parameter corresponding to the address of the target node in the matching entry; if no matching entry exists, a new entry matching the data to be exchanged is created and inserted into the OpenFlow flow table.
  • In summary, the computing node 1200 of this embodiment of the present invention may include: a hardware layer 1210, a host Host 1220 running on the hardware layer 1210, and at least one virtual machine VM 1230 running on the Host 1220, where the hardware layer includes an input/output I/O device 1211 and a storage device 1212, the at least one virtual machine VM includes a first virtual machine 1231 with a virtual switching function, and the at least one VM further includes a second virtual machine 1232.
  • The virtual switching function is implemented on a virtual machine, so that the virtual switch and the ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation the virtual switch and the ordinary VMs use user-space physical resources.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by a source node, where the first message requests the first virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, which instructs the target node to acquire the data to be exchanged from a storage device of the hardware layer.
  • This method decouples the virtual switching function from the host kernel and reduces the coupling with the Host; multiple vSwitches can be deployed in the same Host without being bound to it, giving stronger scalability. After decoupling, the vSwitch no longer depends on the operating system in the Host kernel and becomes easier to deploy, so it has better portability; and because the configuration module (Agent) is separated from the forwarding module for the data to be exchanged (the port mapping table), the method better meets the requirements of software-defined networking.
  • FIG. 13 is a schematic illustration of a computer system in accordance with one embodiment of the present invention.
  • an embodiment of the present invention further provides a computer system 1300, which may include:
  • at least one computing node 1200.
  • The computing node 1200 in the computer system 1300 of this embodiment may include: a hardware layer, a host Host running on the hardware layer, and at least one virtual machine VM running on the Host, where the hardware layer includes an input/output I/O device and a storage device, the at least one virtual machine VM includes a first virtual machine with a virtual switching function, and the at least one VM further includes a second virtual machine.
  • The virtual switching function is implemented on a virtual machine, so that the virtual switch and the ordinary VMs have the same priority, forming a peer-to-peer network virtualization architecture; during resource allocation the virtual switch and the ordinary VMs use user-space physical resources, which facilitates the Host's management and efficient allocation of resources such as bandwidth, CPU, and storage.
  • The virtual switching method applied to the computing node includes: the first virtual machine receives a first message sent by a source node, where the first message requests the first virtual machine to perform exchange processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is the second virtual machine; the first virtual machine determines a second message according to the address of the target node carried in the data to be exchanged and the configured port mapping table, and sends the second message, which instructs the target node to acquire the data to be exchanged from a storage device of the hardware layer.
  • This method decouples the virtual switching function from the host kernel and reduces the coupling with the Host; multiple vSwitches can be deployed in the same Host without being bound to it, giving stronger scalability. After decoupling, the vSwitch no longer depends on the operating system in the Host kernel and becomes easier to deploy, so it has better portability; and because the configuration module (Agent) is separated from the forwarding module for the data to be exchanged (the port mapping table), the method better meets the requirements of software-defined networking.


Abstract

Embodiments of the present invention provide a virtual switching method, a related apparatus, and a computer system. The method includes: receiving a first message sent by a source node, where the first message requests a first virtual machine to perform switching processing on data to be exchanged, the data to be exchanged is sent from the source node to a target node, and at least one of the source node and the target node is a second virtual machine; determining a second message according to the address of the target node carried in the data to be exchanged and a configured port mapping table, and sending the second message, where the second message instructs the target node to acquire the data to be exchanged from a storage device of the hardware layer. By deploying the virtual switching function in a virtual machine, the embodiments place the virtual machine having the virtual switching function on an equal footing with other ordinary virtual machines, which helps the Host manage the virtual network and allocate network resources efficiently and reasonably. Moreover, because the virtual switching function is stripped out of the Host kernel, scalability is enhanced.

Description

Virtual switching method, related apparatus, and computer system
This application claims priority to Chinese Patent Application No. 201310270272.9, filed with the Chinese Patent Office on June 28, 2013 and entitled "Virtual switching method, related apparatus and computer system", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer technology, and more specifically, to a virtual switching method, a related apparatus, and a computer system.
Background
Network virtualization is a way of separating network traffic from physical network elements by using software-based abstraction. Network virtualization has much in common with other forms of virtualization. In network virtualization, the abstraction isolates the network traffic of switches, network ports, routers, and other physical elements in the network. Each physical element is replaced by a virtual representation of that network element, and an administrator can configure the virtual network elements to meet specific needs. The main advantage of network virtualization here is the consolidation of multiple physical networks into a larger logical network. The main existing network virtualization solutions are VMware's Open Virtual Switch (OVS) and Distributed Virtual Switch (DVS). In the mainstream OVS architecture, the virtual switch (vSwitch) is implemented in the host (Host) kernel, that is, in the Virtual Machine Monitor (VMM) kernel, at the core of the virtual network; its architecture is shown in FIG. 1. The vSwitch uses virtual ports and connects, through frontend/backend (FE/BE) channels, to the virtual machines (VMs) and the underlying Network Interface Card (NIC). The Host allocates physical resources such as CPU and memory to the virtual machines and virtual hardware running on it; these physical resources are divided into kernel-space physical resources and user-space physical resources. During switching, the vSwitch needs to request and occupy a large amount of the Host's kernel-space physical resources, which makes it very difficult for the Host to manage the virtual network and allocate resources. The vSwitch bears many tasks and functions, for example the Virtual Local Area Network (VLAN), load balancing, tunneling, security, Link Aggregation Control Protocol (LACP), and Quality of Service (QoS) functions shown in FIG. 1; its design is very large and complex, and the tight coupling between the vSwitch and the Host kernel leaves the scalability and flexibility of the vSwitch, and of the entire virtual network, poor.
发明内容 本发明实施例提供一种虚拟交换方法、相关装置和计算机系统, 将虚拟交 换功能从内核中剥离,提高了虚拟交换设备的扩展性和灵活性, 并将虚拟交换 功能部署在虚拟机上, 与普通虚拟机形成对等节点, 从而有利于 Host对虚拟 网络进行管理并进行高效、 合理的资源分配。 第一方面, 提供了一种虚拟交换的方法, 应用于计算节点上, 所述计算节 点包括:硬件层、运行在所述硬件层之上的宿主机 Host、以及运行在所述 Host 之上的至少一个虚拟机 VM, 其中, 所述硬件层包括输入 /输出 I/O设备和存储 设备,所述至少一个虚拟机 VM包括具有虚拟交换功能的第一虚拟机,所述至 少一个 VM还包括第二虚拟机,所述方法包括: 所述第一虚拟机接收源节点发 送的第一消息,所述第一消息用于请求所述第一虚拟机对待交换数据进行交换 处理, 其中所述待交换数据是从所述源节点发往目标节点的, 所述源节点和所 述目标节点中的至少一个为所述第二虚拟机;所述第一虚拟机根据所述待交换 数据携带的目标节点的地址和配置的端口映射表确定第二消息并发送所述第 二消息,所述第二消息用于指示所述目标节点从所述硬件层的存储设备获取所 述待交换数据。 结合第一方面,在其第一种实现方式中, 所述第一虚拟机接收源节点发送 的第一消息之前, 还包括: 所述第一虚拟机接收所述 Host发送的配置命令; 所述第一虚拟机根据所述配置命令配置用于与所述第二虚拟机进行通信的所 述第一虚拟机的第一虚拟端口,并配置用于与所述 I/O设备进行通信的所述第 一虚拟机的第二虚拟端口;所述第一虚拟机建立所述第一虚拟端口与所述第二 虚拟端口之间的映射关系, 以生成所述端口映射表。 结合第一方面及其上述实现方式,在其第二种实现方式中, 所述接收所述 Host发送的配置命令之后, 还包括: 所述第一虚拟机根据所述配置命令配置 所述第二虚拟机对应的第一共享内存,其中所述第一共享内存为所述硬件层的 存储设备上的指定存储区域。 结合第一方面及其上述实现方式,在其第三种实现方式中, 当所述源节点 为所述第二虚拟机, 所述目标节点为所述 I/O设备时, 所述第一虚拟机接收源 节点发送的第一消息, 包括: 所述第一虚拟机通过所述第一虚拟端口接收所述 第二虚拟机发送的所述第一消息,所述第一消息包括用于向所述第一虚拟机指 示所述第二虚拟机已完成将所述待交换数据写入所述第一共享内存的写完中 断;所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端口 映射表确定第二消息并发送所述第二消息, 包括: 所述第一虚拟机根据用于接 收所述第一消息的所述第一虚拟端口确定对应的所述第一共享内存的地址;从 所述第一共享内存获取所述待交换数据, 根据所述待交换数据携带的所述 I/O 设备的地址从所述端口映射表中确定与所述 I/O设备对应的所述第二虚拟端 口; 确定携带有所述第一共享内存的地址和读取指令的所述第二消息, 并通过 所述第二虚拟端口向所述 I/O设备发送所述第二消息, 以便于所述 I/O设备从 所述第一共享内存读取所述待交换数据。 结合第一方面及其上述实现方式,在其第四种实现方式中, 当所述源节点 为所述 I/O设备, 所述目标节点为所述第二虚拟机时, 所述第一虚拟机接收源 节点发送的第一消息之后还包括: 所述第一虚拟机从所述 I/O设备获取所述待 交换数据携带的目标节点的地址,所述目标节点的地址为所述第二虚拟机的地 址;所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端口 映射表确定第二消息并发送所述第二消息, 包括: 所述第一虚拟机根据所述第 二虚拟机的地址查询所述端口映射表以确定与所述第二虚拟机对应的第一虚 拟端口并确定与所述第二虚拟机对应的第一共享内存的地址; 通过所述 I/O设 备所对应的所述第二虚拟端口向所述 I/O设备发送携带有所述第一共享内存的 地址的回复消息, 以便于所述 I/O设备根据所述回复消息将所述待交换数据写 入所述第一共享内存; 在所述第一虚拟机接收到所述 I/O设备发送的用于向所 述第一虚拟机指示所述 I/O设备已完成将所述待交换数据写入所述第一共享内 存的写完中断时, 确定携带有读取指令的所述第二消息, 通过所述第一虚拟端 口向所述第二虚拟机发送所述第二消息,以便于所述第二虚拟机从所述第一共 享内存读取所述待交换数据。 结合第一方面及其上述实现方式,在其第五种实现方式中, 所述至少一个
VM还包括第三虚拟机, 当所述源节点为所述第二虚拟机, 所述目标节点为所 述第三虚拟机时, 所述第一虚拟机接收源节点发送的第一消息, 包括: 所述第 一虚拟机通过所述第一虚拟端口接收所述第二虚拟机发送的所述第一消息,所 述第一消息包括用于向所述第一虚拟机指示所述第二虚拟机已完成将所述待 交换数据写入所述第二虚拟机与所述第三虚拟机通过所述第一虚拟机预先协 商的第二共享内存的写完中断,其中所述第二共享内存为所述硬件层的存储设 备上的指定存储区域;所述第一虚拟机根据所述待交换数据携带的目标节点的 地址和配置的端口映射表确定第二消息并发送所述第二消息, 包括: 所述第一 虚拟机根据用于接收所述第一消息的所述第一虚拟端口确定与所述第一虚拟 端口对应的所述第二虚拟机的地址;根据所述第二虚拟机的地址和所述待交换 数据携带的第三虚拟机的地址确定所述第二共享内存的地址;确定携带有所述 第二共享内存的地址和读取指令的所述第二消息,并向所述第三虚拟机发送所 述第二消息, 以便于所述第三虚拟机从所述第二共享内存读取所述待交换数 据。 结合第一方面及其上述实现方式,在其第六种实现方式中, 所述方法还包 括: 接收所述目标节点发送的读完指示信息, 以便于所述第一共享内存或所述 第二共享内存被释放。 结合第一方面及其上述实现方式,在其第七种实现方式中, 所述第一虚拟 机接收源节点发送的第一消息之后,还包括: 所述第一虚拟机根据所述待交换 数据携带的目标节点的地址, 在配置的开放流 Openflow流表中确定与所述目 标节点的地址所匹配的表项, 其中, 所述 Openflow流表中包括至少一个表项, 所述表项包括地址、 虚拟端口和执行动作参数; 如果所述匹配的表项存在, 所 述第一虚拟机根据所述匹配的表项中与所述目标节点的地址所对应的执行动 作参数处理所述待交换数据; 如果所述匹配的表项不存在, 所述第一虚拟机建 立能够与所述待交换数据匹配的新表项, 并在所述 Openflow流表中插入所述 新表项。
第二方面, 提供了一种宿主机, 其特征在于, 包括: 创建模块, 用于在输 入 /输出 I/O设备的 I/O虚拟功能启动后, 在宿主机 Host之上产生至少一个虚 拟机 VM, 其中所述至少一个 VM包括具有虚拟交换功能的第一虚拟机, 所述 至少一个 VM还包括第二虚拟机; 配置模块,用于向所述第一虚拟机发送配置 命令,以便于所述第一虚拟机根据所述配置命令配置用于与所述第二虚拟机进 行通信的所述第一虚拟机的第一虚拟端口,并配置用于与所述 I/O设备进行通 信的所述第一虚拟机的第二虚拟端口。 第三方面, 提供了一种计算节点, 其特征在于, 运行在宿主机 Host之上, 所述 Host运行在硬件层之上,所述硬件层包括输入 /输出 I/O设备和存储设备, 所述虚拟机包括: 接收模块, 用于接收源节点发送的第一消息, 所述第一消息 用于请求所述虚拟机对待交换数据进行交换处理,其中所述待交换数据是从所 述源节点发往目标节点的,所述源节点和所述目标节点中的至少一个为第二虚 拟机, 所述第二虚拟机运行在所述 Host之上; 交换处理模块, 用于根据所述 待交换数据携带的目标节点的地址和所述虚拟机配置的端口映射表确定第二 消息,所述第二消息用于指示所述目标节点从所述硬件层的存储设备获取所述 待交换数据; 发送模块, 用于向所述目标节点发送所述第二消息。 结合第三方面, 在其第一种实现方式中, 其特征在于, 包括: 代理 Agent 模块, 用于根据所述 Host发送的配置命令, 配置用于与所述第二虚拟机进行 通信的所述虚拟机的第一虚拟端口, 并配置用于与所述 I/O设备进行通信的所 述虚拟机的第二虚拟端口; 生成模块, 用于建立所述第一虚拟端口与所述第二 虚拟端口之间的映射关系, 以生成所述端口映射表。 结合第三方面及其上述实现方式, 在其第二种实现方式中, 其特征在于, 所述 Agent模块 ,还用于根据所述配置命令配置所述第二虚拟机对应的第一共 享内存, 其中所述第一共享内存为所述硬件层的存储设备上的指定存储区域。 结合第三方面及其上述实现方式,在其第三种实现方式中,所述接收模块, 具体用于通过所述第一虚拟端口接收所述第一消息,所述第一消息包括用于向 所述虚拟机指示所述源节点已完成将所述待交换数据写入所述第一共享内存 的写完中断; 所述交换处理模块, 具体用于根据用于接收所述第一消息的所述 第一虚拟端口确定对应的所述第一共享内存的地址;从所述第一共享内存获取 所述待交换数据,根据所述待交换数据携带的所述目标节点的地址从所述端口 映射表中确定与所述目标节点对应的所述第二虚拟端口;确定携带有所述第一 共享内存的地址和读取指令的所述第二消息; 所述发送模块, 具体用于通过所 述第二虚拟端口向所述目标节点发送所述第二消息; 其中, 所述源节点为所述 第二虚拟机, 所述目标节点为所述 I/O设备。 结合第三方面及其上述实现方式,在其第四种实现方式中,所述接收模块, 具体用于接收源节点发送的所述第一消息; 所述交换处理模块,具体用于获取 所述待交换数据携带的目标节点的地址;根据所述目标节点的地址查询所述端 口映射表以确定与所述目标节点对应的第一虚拟端口并确定与所述目标节点 对应的第一共享内存的地址; 所述发送模块, 具体用于通过所述源节点所对应 的所述第二虚拟端口向所述源节点发送携带有所述第一共享内存的地址的回 复消息; 所述交换处理模块,还用于在接收到所述源节点发送的用于向所述虚 拟机指示所述源节点已完成将所述待交换数据写入所述第一共享内存的写完 中断时, 确定携带有读取指令的所述第二消息; 所述发送模块, 还用于通过所 述第一虚拟端口向所述目标节点发送所述第二消息; 所述接收模块,还用于接 收所述源节点发送的指示所述源节点已完成将所述待交换数据写入所述第一 共享内存的写完中断; 其中, 所述源节点为所述 I/O设备, 所述目标节点为所 述第二虚拟机。 结合第三方面及其上述实现方式,在其第五种实现方式中,所述接收模块, 具体用于通过所述第一虚拟端口接收所述源节点发送的所述第一消息,所述第 一消息包括写完中断; 所述交换处理模块, 具体用于根据用于接收所述第一消 息的所述第一虚拟端口确定所述第一虚拟端口对应的所述源节点的地址;根据 所述源节点的地址和所述待交换数据携带的目标节点的地址确定所述第二共 享内存的地址;确定携带有所述第二共享内存的地址和读取指令的所述第二消 息; 所述发送模块, 具体用于向所述目标节点发送所述第二消息; 其中, 所述 至少一个 VM还包括第三虚拟机,所述源节点为所述第二虚拟机,所述目标节 点为所述第三虚拟机。 第四方面, 提供了一种计算节点, 包括: 硬件层、 运行在所述硬件层之上 的宿主机 Host、 以及运行在所述 
Host之上的至少一个虚拟机 VM, 其中, 所 述硬件层包括输入 /输出 I/O设备和存储设备, 所述至少一个虚拟机 VM包括 具有虚拟交换功能的第一虚拟机,所述至少一个 VM还包括第二虚拟机,其中: 所述第一虚拟机, 用于接收源节点发送的第一消息, 所述第一消息用于请求所 述第一虚拟机对待交换数据进行交换处理,其中所述待交换数据是从所述源节 点发往目标节点的,所述源节点和所述目标节点中的至少一个为所述第二虚拟 机; 所述第一虚拟机,还用于根据所述待交换数据携带的目标节点的地址和配 置的端口映射表确定第二消息并发送所述第二消息,所述第二消息用于指示所 述目标节点从所述硬件层的存储设备获取所述待交换数据。
结合第四方面, 在其第一种实现方式中, 所述 Host, 用于向所述第一虚 拟机发送配置命令; 所述第一虚拟机,还用于根据所述配置命令配置用于与所 述第二虚拟机进行通信的所述第一虚拟机的第一虚拟端口,并配置用于与所述 I/O设备进行通信的所述第一虚拟机的第二虚拟端口; 所述第一虚拟机, 还用 于建立所述第一虚拟端口与所述第二虚拟端口之间的映射关系,以生成所述端 口映射表。
结合第四方面及其上述实现方式,在其第二种实现方式中, 所述第一虚拟 机,还用于根据所述配置命令配置所述第二虚拟机对应的第一共享内存, 其中 所述第一共享内存为所述硬件层的存储设备上的指定存储区域。
结合第四方面及其上述实现方式, 在其第三种实现方式中, 所述源节点, 用于将所述待交换数据写入所述第一共享内存; 所述源节点,还用于向所述第 一虚拟机发送所述第一消息; 所述第一虚拟机, 具体用于通过所述第一虚拟端 口接收所述第一消息,所述第一消息包括用于向所述第一虚拟机指示所述源节 点已完成将所述待交换数据写入所述第一共享内存的写完中断;以及根据用于 接收所述第一消息的所述第一虚拟端口确定对应的所述第一共享内存的地址; 从所述第一共享内存获取所述待交换数据, 根据所述待交换数据携带的所述 I/O设备的地址从所述端口映射表中确定与所述 I/O设备对应的所述第二虚拟 端口; 确定携带有所述第一共享内存的地址和读取指令的所述第二消息, 并通 过所述第二虚拟端口向所述目标节点发送所述第二消息; 所述目标节点, 用于 根据所述第二消息从所述第一共享内存读取所述待交换数据; 其中, 所述源节 点为所述第二虚拟机, 所述目标节点为所述 I/O设备。
结合第四方面及其上述实现方式,在其第四种实现方式中,所述第一虚拟 机, 具体用于接收源节点发送的所述第一消息, 获取所述待交换数据携带的目 标节点的地址;根据所述目标节点的地址查询所述端口映射表以确定与所述目 标节点对应的第一虚拟端口并确定与所述目标节点对应的第一共享内存的地 址;通过所述源节点所对应的所述第二虚拟端口向所述源节点发送携带有所述 第一共享内存的地址的回复消息; 以及,在接收到所述源节点发送的用于向所 述第一虚拟机指示所述源节点已完成将所述待交换数据写入所述第一共享内 存的写完中断时, 确定携带有读取指令的所述第二消息, 通过所述第一虚拟端 口向所述目标节点发送所述第二消息; 所述源节点,还用于根据所述回复消息 中的所述第一共享内存的地址将所述待交换数据写入所述第一共享内存;所述 源节点,还用于向所述第一虚拟机发送指示所述源节点已完成将所述待交换数 据写入所述第一共享内存的写完中断; 所述目标节点, 用于根据所述第二消息 从所述第一共享内存读取所述待交换数据;其中,所述源节点为所述 I/O设备, 所述目标节点为所述第二虚拟机。 结合第四方面及其上述实现方式, 在其第五种实现方式中, 所述源节点, 还用于将所述待交换数据写入所述源节点与所述目标节点通过所述第一虚拟 机预先协商的第二共享内存,其中所述第二共享内存为所述硬件层的存储设备 上的指定存储区域; 所述源节点,还用于通过所述第一虚拟端口向所述第一虚 拟机发送所述第一消息, 所述第一消息包括写完中断; 所述第一虚拟机, 具体 用于根据用于接收所述第一消息的所述第一虚拟端口确定所述第一虚拟端口 对应的所述源节点的地址;根据所述源节点的地址和所述待交换数据携带的目 标节点的地址确定所述第二共享内存的地址;确定携带有所述第二共享内存的 地址和读取指令的所述第二消息, 并向所述目标节点发送所述第二消息; 所述 目标节点, 用于根据所述第二消息从所述第二共享内存读取所述待交换数据; 其中, 所述至少一个 VM还包括第三虚拟机, 所述源节点为所述第二虚拟机, 所述目标节点为所述第三虚拟机。 结合第四方面及其上述实现方式,在其第六种实现方式中, 所述目标节点 根据所述第二消息从所述共享内存读取所述待交换数据之后, 所述目标节点, 还用于向所述第一虚拟机发送读完指示信息,以便于所述第一共享内存或所述 第二共享内存被释放; 所述第一虚拟机,还用于释放所述第一共享内存或所述 第二共享内存。 结合第四方面及其上述实现方式,在其第七种实现方式中,在接收源节点 发送的第一消息之后, 所述第一虚拟机,还用于根据所述待交换数据携带的目 标节点的地址, 在配置的开放流 Openflow流表中确定与所述目标节点的地址 所匹配的表项, 其中, 所述 Openflow流表中包括至少一个表项, 所述表项包 括地址、 虚拟端口和执行动作参数; 如果所述匹配的表项存在, 根据所述匹配 的表项中与所述目标节点的地址所对应的执行动作参数处理所述待交换数据; 如果所述匹配的表项不存在, 建立能够与所述待交换数据匹配的新表项, 并在 所述 Openflow流表中插入所述新表项。
第五方面, 提供了一种计算机系统, 包括: 至少一个如第四方面所述的计 算节点。
由上可见, 本发明实施例中的计算节点包括: 硬件层、 运行在所述硬件层 之上的宿主机 Host、 以及运行在所述 Host之上的至少一个虚拟机 VM, 其中, 所述硬件层包括输入 /输出 I/O设备和存储设备, 所述至少一个虚拟机 VM包 括具有虚拟交换功能的第一虚拟机,所述至少一个 VM还包括第二虚拟机;如 此,将虚拟交换功能实现在虚拟机上,使得虚拟交换机与普通 VM处于同等优 先级, 形成对等的网络虚拟化架构, 在进行资源分配时虚拟交换机和普通 VM 一样使用用户空间的物理资源, 这样便于 Host进行管理和高效合理地进行带 宽、 CPU、 存储等资源的分配。 应用于该计算节点上的虚拟交换方法包括: 所 述第一虚拟机接收源节点发送的第一消息,所述第一消息用于请求所述第一虚 拟机对待交换数据进行交换处理,其中所述待交换数据从所述源节点发往目标 节点, 所述源节点和所述目标节点中的至少一个为所述第二虚拟机; 所述第一 虚拟机根据所述待交换数据携带的目标节点的地址和所述配置的端口映射表 确定第二消息并发送所述第二消息,所述第二消息用于指示所述目标节点从所 述硬件层的存储设备获取所述待交换数据。 该方法将虚拟交换功能从 Host内 核中剥离解耦, 降低与 Host的耦合性, 可以在同一 Host内部署多个 vSwitch, 不受 Host约束, 因此具有更强的扩展性, 并且解耦后 vSwtich不再依赖 Host 内核中的操作系统, 变得更加易于部署, 所以获得了更好的移植性, 并且由于 配置模块(agent ) 与待交换数据交换转发模块(端口映射表)相分离, 更符 合软件定义网络的要求。
附图说明 为了更清楚地说明本发明实施例的技术方案, 下面将对本发明实施例中所需要使用的附图作简单地介绍, 显而易见地, 下面所描述的附图仅仅是本发明的一些实施例, 对于本领域普通技术人员来讲, 在不付出创造性劳动的前提下, 还可以根据这些附图获得其他的附图。 图 1是现有技术中 OVS的架构图。
图 2是本发明一个实施例的虚拟化软硬件体系架构示意图。 图 3是本发明一个实施例的虚拟交换方法的流程图。 图 4是本发明一个实施例的虚拟交换数据流的示意图。 图 5是本发明另一实施例的虚拟交换数据流的示意图。 图 6是本发明另一实施例的虚拟交换数据流的示意图。 图 7是本发明另一实施例的用于软件定义网络 SDN的虚拟交换设备的示 意图。 图 8是本发明另一实施例的分布式实施的示意图。 图 9是本发明另一实施例的分布式实施的流程图。 图 10是本发明一个实施例的宿主机的模块架构示意图。 图 11是本发明一个实施例的虚拟机的模块架构示意图。
图 12是本发明一个实施例的计算机节点的示意图。 图 13是本发明一个实施例的计算机系统的示意图。
具体实施方式 下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清 楚、 完整地描述, 显然, 所描述的实施例是本发明的一部分实施例, 而不是全 部实施例。基于本发明中的实施例, 本领域普通技术人员在没有做出创造性劳 动的前提下所获得的所有其他实施例, 都应属于本发明保护的范围。
为了方便理解本发明实施例,首先在此介绍本发明实施例描述中会引入的 几个术语;
虚拟机 VM: 通过虚拟机软件可以在一台物理计算机上模拟出一台或者多台虚拟的计 算机, 而这些虚拟机就像真正的计算机那样进行工作,虚拟机上可以安装操作 系统和应用程序,虚拟机还可访问网络资源。对于在虚拟机中运行的应用程序 而言, 虚拟机就像是在真正的计算机中进行工作。
硬件层:
虚拟化环境运行的硬件平台。 其中, 硬件层可包括多种硬件, 例如某计算节点的硬件层可包括 CPU和内存, 还可以包括网卡( Network Interface Card, NIC )、 存储器等等高速/低速输入/输出 ( I/O, Input/Output )设备, 其中 NIC为底层物理网卡, 以下简称 Host NIC来区别于虚拟机的虚拟网卡 VM NIC。
宿主机( Host ): 作为管理层, 用以完成硬件资源的管理、 分配; 为虚拟机呈现虚拟硬件平台; 实现虚拟机的调度和隔离。 其中, Host可能是虚拟机监控器( VMM ); 或者, 有时 VMM和 1个特权虚拟机配合, 两者结合组成 Host。 其中, 虚拟硬件平台对其上运行的各个虚拟机提供各种硬件资源, 如提供虚拟 CPU、 内存、 虚拟磁盘、 虚拟网卡等等。 其中, 该虚拟磁盘可对应 Host的一个文件或者一个逻辑块设备。 虚拟机则运行在 Host为其准备的虚拟硬件平台上, Host上运行一个或多个虚拟机。
虚拟交换机 ( Virtual Switch, vS witch ):
虚拟交换机在 Host的控制下将虚拟机互相连接起来, 并且接入到物理网 络当中, 虚拟交换机就像真正的虚拟机那样工作, 现有的虚拟交换机在 Host 内核中实现, 处于虚拟网络的核心位置, 负担虚拟局域网 ( Virtual Local Area Network, VLAN )、 负载均衡 Load-balance、 隧道 Tunneling、 安全 Security、 链路汇聚控制协议 ( Link Aggregation Control Protocol , LACP )、 服务质量 ( Quality of Service, QoS )等等诸多功能。 共享内存:
操作系统进程间通信( Inter-Process Communication, IPC )的一种机制, 共享内存是进程间通信中最简单的方式之一, 共享内存允许两个或更多进程访问同一块内存, 在网络虚拟化中, 共享内存允许两个或者更多的虚拟机、 虚拟硬件访问同一块内存。 共享内存在各种进程间通信方式中具有最高效率。 零拷贝: 避免 CPU将数据从一块存储拷贝到另外一块存储的技术, 通过减少或消除关键通信路径中影响速率的操作, 降低数据传输的开销, 从而有效地提高通信性能, 实现高速数据传输, 实现方式有 IO直通、 MMAP等。 软件定义网络( Software Defined Network, SDN ):
SDN是新一代网络架构, 其核心技术开放流 Openflow通过将网络设备控 制面与数据面分离开来,从而实现了网络流量的灵活控制, 为核心网络以及应 用的创新提供了良好的平台。 图 2示出了本发明实施例中将 vSwitch部署到 VM中的虚拟化方案的软硬 件体系架构示意图, 该体系架构主要包括三个层次: 硬件层、 Host和虚拟机 ( VM )。 其中硬件层包括 I/O设备, 即物理网卡 NIC , 通过该 NIC可以与外 界其他 Host或网络进行通信, 硬件层还可以包括存储设备, 例如内存、 硬盘 等等。 Host运行在硬件层之上, 其中 Host可能是虚拟机监控器 (VMM ), 或 者, 有时 VMM和 1个特权虚拟机配合, 两者结合组成 Host, 图 2中示出的 为第二种情况, 然而这仅仅为一个示例, 本发明对此不作限定。 在 Host之上 运行的至少一个虚拟机 VM ,其中一个 VM为本发明中的具有虚拟交换功能的 虚拟机(第一虚拟机), 同时还可以有若干个普通的虚拟机(第二虚拟机、 第 三虚拟机等等)。
以该体系架构建立虚拟化网络环境的过程中, Host 中的配置管理模块 ( Config and Manage Module , CMM )可以向具有虚拟交换功能的第一虚拟机 (以下用 vSwitch代称)发送配置命令来进行虚拟网络环境的配置以及 vSwitch 的配置。 具体地, CMM可以通过 vSwitch中的配置代理模块 ( agent ) 来进行 配置, 包括端口映射表, VLAN表,访问控制列表(Access Control List, ACL ) 等的管理和配置。 其中该 Host中的配置管理模块可以通过 IPC (例如 IOCTL, NETLINK, SOCKET等 ) 与 vSwitch的 agent模块相连接, 从而可以将 Host 虚拟环境的配置传入 vSwitch, 具体可以包括 Host NIC、 VM的后端 BE、共享 内存、 DMA中断等配置信息, 使得 vSwitch获得虚拟环境信息, 从而建立相 应的虚拟网络环境。 具体地, 可以在 VM创建好后, 由配置管理模块为 VM创建虚拟 NIC接 口, 而后配置管理模块可以通过 agent模块协商 vSwitch与 Host NIC的通信机 制(通信方式)和端口映射,且协商 vSwitch与 VMM NIC之间的通信机制(通 信方式 )和端口映射, 还可以进一步地协商 vSwitch与 VMM NIC之间的共享 内存等。 其中, vSwitch与 Host NIC之间可以使用 10直通、 零拷贝等方式通 信, vSwitch与 VM之间可以使用共享内存、 前后端 FE/BE事件通道等技术进 行通信。 根据协商好的各项配置的对应关系建立表项, 以生成映射表, 例如将 VM的地址、该 VM所对应的 vSwitch的虚拟端口的端口号、该 VM与 vSwitch 之间协商的共享内存地址建立对应关系,以形成表项,其中该 VM为普通虚拟 机, 例如第二虚拟机。 该虚拟化网络环境搭建好后,进行数据交换时: 该第一虚拟机( vSwitch ), 用于接收源节点发送的第一消息,该第一消息用于请求该第一虚拟机对待交换 数据进行交换处理, 其中该待交换数据从该源节点发往目标节点, 该源节点和 该目标节点中的至少一个为该第二虚拟机; 该第一虚拟机,还用于根据该待交 换数据携带的目标节点的地址和该配置的端口映射表确定第二消息并发送所 述第二消息,该第二消息用于指示该目标节点从该硬件层的存储设备获取该待 交换数据。 从而通过 vSwitch的信令控制, 和交换处理, 实现了待交换数据的 转发。 如此, 虚拟交换功能从 Host内核中剥离解耦, 转而在虚拟机上实现虚 拟交换的功能, 筒化了 Host内核的设计和负担, 并且由于 VM具有灵活性和 很好的扩展性,从而使得 vSwitch以及整个虚拟网络的扩展性和灵活性都得到 了提高。 进一步地, 因为将虚拟交换功能实现在虚拟机上, 使得虚拟交换机与 普通 VM处于同等优先级,形成对等的网络虚拟化架构,在进行资源分配时虚 拟交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行管理 和高效合理地进行资源分配。 图 3是本发明一个实施例的虚拟交换方法的流程图。图 2的方法由具有虚 拟交换功能的虚拟机(下文筒称为第一虚拟机)执行。
301 , 第一虚拟机接收源节点发送的第一消息, 第一消息用于请求第一虚 拟机对待交换数据进行交换处理, 其中待交换数据从源节点发往目标节点, 源 节点和目标节点中的至少一个为第二虚拟机。 第一虚拟机为具有虚拟交换功能的虚拟机,与其他普通虚拟机处于同等地 位并运行在 Host之上。 其中源节点可以是该 Host上的普通虚拟机 VM, 应当 理解的是: 这里的普通虚拟机是相对于具有虚拟交换功能的虚拟机而言,也可 以是该 Host外部的虚拟机或物理机, 然而由于该 Host是通过 Host NIC与外 界进行通信的, 所以与该 Host外部的虚拟机或物理机的通信都筒化地描述成 与 Host NIC进行通信, 即源节点也可以是 Host NIC。 同样地, 目标节点也可 以是该 Host上的普通虚拟机 VM , 也可以是 Host NIC。
302, 第一虚拟机根据待交换数据携带的目标节点的地址和配置的端口映 射表确定第二消息并发送所述第二消息,第二消息用于指示目标节点从硬件层 的存储设备获取待交换数据。
应理解, 上述步骤 302中, 配置的端口映射表可以由第一虚拟机来进行配 置 ,包括虚拟化网络建立初期端口映射表的初始化配置以及虚拟化网络后期运 行时端口映射表的动态维护。 而第一虚拟机可以仅仅是配置命令的执行者, 而 配置命令可以由 Host或者网络维护人员配置。 本发明实施例通过将虚拟交换功能部署到虚拟机中, 筒化了 VMM, 有利 于 Host对虚拟网络进行管理并进行高效、 合理的网络资源分配。 可选地, 作为一个实施例, 步骤 301之前, 还包括: 接收 Host发送的配 置命令;根据配置命令配置用于与第二虚拟机进行通信的第一虚拟机的第一虚 拟端口, 并配置用于与 I/O设备进行通信的第一虚拟机的第二虚拟端口; 建立 第一虚拟端口与第二虚拟端口之间的映射关系, 以生成端口映射表。 可选地,作为另一个实施例, 第一虚拟机根据配置命令配置第二虚拟机对 应的第一共享内存, 其中第一共享内存为硬件层的存储设备上的指定存储区 域。 具体地, Host 中的配置管理模块可以通过 vSwitch 中的 agent模块协商 vSwitch与 Host NIC的通信机制 (通信方式 )和端口映射, 且协商 vSwitch与 VMM NIC之间的通信机制 (通信方式 )和端口映射, 可选地, 还可以进一步 地协商 vSwitch与 VMM NIC之间的共享内存等, 其中共享内存为硬件层的存 储设备上的指定存储区域。 而后可以将协商好的各项配置的对应关系建立表 项, 生成端口映射表, 例如, 将 VM的地址、 该 VM所对应的 vSwitch的端口 号、 该 VM与 vSwitch之间协商的共享内存地址建立对应的关系, 生成端口映 射表的表项。在进行虚拟交换时, 第一虚拟机从第一虚拟机的第一虚拟端口接 收待交换数据, 其中第一虚拟端口对应于源节点; 通过第一虚拟机的第二虚拟 端口向目标节点发送待交换数据,其中第二虚拟端口是第一虚拟机根据第一虚 拟端口和预先配置的端口映射表确定的。 上述从第一虚拟端口接收待交换数 据,并通过第二虚拟端口向目标节点发送待交换数据的过程为第一虚拟机的逻 辑交换过程。 其中, 第一虚拟机与源节点通信的第一虚拟端口, 第一虚拟机与 目标节点通信的第二虚拟端口都是预先协商并配置好的。 可选地, 作为另一个实施例, 当源节点为第二虚拟机, 目标节点为 I/O设 备时, 第一虚拟机接收源节点发送的第一消息, 包括: 第一虚拟机通过第一虚 拟端口接收第二虚拟机发送的第一消息,第一消息包括用于向第一虚拟机指示 第二虚拟机已完成将待交换数据写入共享内存的写完中断;第一虚拟机根据用 于接收第一消息的第一虚拟端口确定对应的第一共享内存的地址;从第一共享 内存获取待交换数据,根据待交换数据携带的 I/O设备的地址从端口映射表中 确定与 I/O设备对应的第二虚拟端口; 确定携带有第一共享内存的地址和读取 指令的第二消息, 并通过第二虚拟端口向 I/O设备发送第二消息, 以便于 I/O 设备从第一共享内存读取待交换数据。 具体地,作为源节点的 Host中的第二虚拟机与第一虚拟端口建立虚连接, 其中第一虚拟端口是第一虚拟机预先配置的与该第二虚拟机对应的虚拟端口。 第二虚拟机向第一虚拟端口发送待交换数据,该待交换数据实际写入该第二虚 拟机与第一虚拟机预先协商的共享内存中。 写入完毕后, 第二虚拟机向第一虚 拟机发送写完指示信息, 第一虚拟机查询内部与配置的端口映射表, 以确定第 二虚拟端口以及与第二虚拟端口对应的主机网卡 Host NIC ,通过第二虚拟端口 向 Host NIC发送读取指示信息,令 Host NIC从共享内存中读取该待交换数据, 以便于 Host NIC进一步向 Host外部的目标节点发送该待交换数据。 应理解, 在第二虚拟机向 Host外部发送待交换数据的过程中, 目标节点也可以理解为 Host NIC。
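The VM-to-NIC path just described (source VM writes into pre-negotiated shared memory, raises a write-complete interrupt, and the vSwitch directs the NIC to read) can be sketched as a toy simulation. This is an illustration only; the table contents and message shapes are assumptions, not the patent's actual interfaces:

```python
# Toy simulation of the VM -> Host NIC exchange described above.
# Names and message formats are illustrative assumptions.

shared_memory = {}          # region address -> bytes (stands in for host RAM)
port_to_shm = {1: 0xA000}   # first virtual port -> its shared-memory region
addr_to_port = {"nic": 2}   # target address -> second virtual port

def vm_send(data, vport):
    shm = port_to_shm[vport]
    shared_memory[shm] = data                        # VM writes to shared memory
    return {"type": "write_done", "vport": vport}    # first message: write-complete interrupt

def vswitch_handle(first_msg, target_addr):
    shm = port_to_shm[first_msg["vport"]]            # shm inferred from arrival port
    out_port = addr_to_port[target_addr]             # port-mapping-table lookup
    return {"type": "read", "shm": shm, "vport": out_port}   # second message

def nic_handle(second_msg):
    return shared_memory.pop(second_msg["shm"])      # NIC reads; region is freed

msg1 = vm_send(b"frame-bytes", vport=1)
msg2 = vswitch_handle(msg1, target_addr="nic")
payload = nic_handle(msg2)
```

Note that only control messages pass through the vSwitch; the payload moves once, from shared memory to the NIC, which is what allows the zero-copy behaviour mentioned earlier.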
可选地, 作为另一个实施例, 当源节点为 I/O设备, 目标节点为第二虚拟 机时, 第一虚拟机接收源节点发送的第一消息之后还包括: 第一虚拟机接收源 节点发送的第一消息之后还包括: 第一虚拟机从 I/O设备获取待交换数据携带 的目标节点的地址, 目标节点的地址为第二虚拟机的地址; 第一虚拟机根据待 交换数据携带的目标节点的地址和配置的端口映射表确定第二消息并发送所 述第二消息, 包括: 第一虚拟机根据第二虚拟机的地址查询端口映射表以确定 与第二虚拟机对应的第一虚拟端口并确定与第二虚拟机对应的第一共享内存 的地址; 通过 I/O设备所对应的第二虚拟端口向 I/O设备发送携带有第一共享 内存的地址的回复消息, 以便于 I/O设备根据回复消息将待交换数据写入第一 共享内存; 在第一虚拟机接收到 I/O设备发送的用于向第一虚拟机指示 I/O设 备已完成将待交换数据写入第一共享内存的写完中断时,确定携带有读取指令 的第二消息,通过第一虚拟端口向第二虚拟机发送第二消息, 以便于第二虚拟 机从第一共享内存读取待交换数据。 具体地, 第一虚拟机从 I/O设备获取待交换数据携带的目标节点的地址, 是由第一虚拟机在接收到第一消息的通知后,得知 I/O设备(即底层物理网卡 ) 接收到了待交换数据,之后第一虚拟机则可以通过驱动层直接访问该待交换数 据以获取其携带的目标节点的地址。 可选地,作为另一个实施例, 至少一个 VM还包括第三虚拟机, 当源节点 为第二虚拟机, 目标节点为第三虚拟机时, 即源节点和目标节点均为 Host上 的普通 VM时, 第一虚拟机接收源节点发送的第一消息, 包括: 第一虚拟机通 过第一虚拟端口接收第二虚拟机发送的第一消息,第一消息包括用于向第一虚 拟机指示第二虚拟机已完成将待交换数据写入第二虚拟机与第三虚拟机通过 第一虚拟机预先协商的第二共享内存的写完中断,其中第二共享内存为硬件层 的存储设备上的指定存储区域;第一虚拟机根据待交换数据携带的目标节点的 地址和配置的端口映射表确定第二消息并发送所述第二消息, 包括: 第一虚拟 机根据用于接收第一消息的第一虚拟端口确定与第一虚拟端口对应的第二虚 拟机的地址;根据第二虚拟机的地址和待交换数据携带的第三虚拟机的地址确 定第二共享内存的地址;确定携带有第二共享内存的地址和读取指令的第二消 息, 并向第三虚拟机发送第二消息, 以便于第三虚拟机从第二共享内存读取待 交换数据。
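The reverse path above (I/O device as source, VM as target) can be sketched the same way: the vSwitch answers the NIC's request with the shared-memory address, the NIC writes the data in (for example via DMA), and after the write-complete interrupt the vSwitch tells the target VM to read. All names below are hypothetical:

```python
# Sketch of the NIC -> VM path described above. Illustrative names only.

shared_memory = {}
vm_shm = {"vm1-mac": (1, 0xB000)}   # target address -> (first vport, shm region)

def vswitch_on_request(target_addr):
    vport, shm = vm_shm[target_addr]              # port-mapping-table lookup
    return {"type": "reply", "shm": shm}          # tell the NIC where to write

def nic_dma_write(reply, data):
    shared_memory[reply["shm"]] = data            # e.g. DMA write into shared memory
    return {"type": "write_done"}                 # write-complete interrupt

def vswitch_on_write_done(target_addr):
    vport, shm = vm_shm[target_addr]
    return {"type": "read", "vport": vport, "shm": shm}   # second message to the VM

reply = vswitch_on_request("vm1-mac")
nic_dma_write(reply, b"inbound-frame")
msg2 = vswitch_on_write_done("vm1-mac")
data = shared_memory[msg2["shm"]]
```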
其中,第二共享内存是第二虚拟机与第三虚拟机通过第一虚拟机进行协商 的, 具体可以通过 Xen的事件通道( Event Channel )进行协商。 可选地, 作为另一个实施例, 上述方法还包括: 接收目标节点发送的读完 指示信息, 以便于释放第一共享内存或者第二共享内存。 具体地, 目标节点在 读取完待交换数据后向第一虚拟机发送读完指示信息,第一虚拟机接收到读完 指示信息后, 恢复共享内存的可写权限, 即释放该共享内存。 应理解, 以上所述第一共享内存和第二共享内存仅仅为了区分,对本发明 不构成限定。第一共享内存和第二共享内存都是硬件层存储设备上指定的一部 分内存空间, 具有随机性和不确定性。 例如, 第一共享内存被释放后, 也可能 转而被分配作为第二共享内存, 在这种情况下, 第一共享内存和第二共享内存 对应相同的内存空间。 可选地, 作为另一个实施例, 在端口映射表为开放流 Openflow流表时, 第一虚拟机根据待交换数据携带的目标节点的地址, 在该 Openflow流表中确 定与目标节点的地址所匹配的表项, 其中, Openflow 流表中包括至少一个表 项, 表项包括地址、 虚拟端口和执行动作参数; 如果匹配的表项存在, 第一虚 拟机根据匹配的表项中与目标节点的地址所对应的执行动作参数处理待交换 数据; 如果匹配的表项不存在, 第一虚拟机建立能够与待交换数据匹配的新表 项, 并在 Openflow流表中插入新表项。 由上可见, 本发明实施例中的计算节点包括: 硬件层、 运行在硬件层之上 的宿主机 Host、 以及运行在 Host之上的至少一个虚拟机 VM, 其中, 硬件层 包括输入 /输出 I/O设备和存储设备, 至少一个虚拟机 VM包括具有虚拟交换 功能的第一虚拟机, 至少一个 VM还包括第二虚拟机; 如此, 将虚拟交换功能 实现在虚拟机上,使得虚拟交换机与普通 VM处于同等优先级,形成对等的网 络虚拟化架构,在进行资源分配时虚拟交换机和普通 VM—样使用用户空间的 物理资源, 这样便于 Host进行管理和高效合理地进行资源分配。 应用于该计 算节点上的虚拟交换方法包括: 第一虚拟机接收源节点发送的第一消息, 第一 消息用于请求第一虚拟机对待交换数据进行交换处理,其中待交换数据从源节 点发往目标节点, 源节点和目标节点中的至少一个为第二虚拟机; 第一虚拟机 根据待交换数据携带的目标节点的地址和配置的端口映射表确定第二消息并 发送所述第二消息,第二消息用于指示目标节点从硬件层的存储设备获取待交 换数据。 该方法将虚拟交换功能从 Host内核中剥离解耦, 转而在虚拟机上实 现虚拟交换的功能, 筒化了 Host内核的设计和负担, 并且由于 VM具有灵活 性和很好的扩展性, 从而使得 vSwitch以及整个虚拟网络的扩展性和灵活性都 得到了提高, 便于控制面和数据面的分离, 使其满足 SDN 的需求, 支持 Openflow。
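The Openflow flow-table variant described above (match the target address against table entries; apply the stored action on a hit; build and insert a new entry on a miss) can be sketched as follows. The entry fields and the controller stand-in are assumptions for illustration, not the Openflow wire protocol:

```python
# Sketch of the Openflow-style flow-table behaviour described above.

flow_table = {}   # target address -> {"vport": ..., "action": ...}

def controller_make_entry(target_addr):
    # stand-in for the controller consulting its rule base to build a new entry
    return {"vport": 9, "action": "forward"}

def handle_packet(target_addr):
    entry = flow_table.get(target_addr)
    if entry is None:                          # table miss
        entry = controller_make_entry(target_addr)
        flow_table[target_addr] = entry        # insert the new entry
    return entry["action"], entry["vport"]

act1 = handle_packet("10.0.0.2")   # miss: a new entry is created and inserted
act2 = handle_packet("10.0.0.2")   # hit: the same entry is reused
```

Subsequent packets of the same flow hit the inserted entry directly, which is the control-plane/data-plane split the SDN discussion relies on.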
图 4是本发明一个实施例的虚拟交换数据流的示意图。如图 4所示,虚拟 交换机 vSwitch (虚拟交换功能)部署于第一虚拟机上, 使该第一虚拟机成为 虚拟交换设备, 并与普通的虚拟机 VM1、 VM2处于同等地位。 其中第一虚拟 机中的代理 Agent模块与主机 Host 中的配置管理模块 ( Config and Manage Module )连接, 以便于系统管理员对第一虚拟机进行配置。 第一虚拟机的虚拟 端口 port可以与 VM1、 VM2或者 VMM的底层物理网卡 HOST NIC进行连接。 以下通过数据流详细说明 Host中的普通 VM(例如 VM1 )向外界( HOST NIC ) 发送待交换数据的过程。 应理解, 图 4所示的系统架构仅仅为一个示例, 其中 VM、 port, Host NIC等模块的数量可以进行扩展。
401 , 预先配置。 在进行虚拟交换之前, 需要构建虚拟网络, 并对虚拟交换机 vSwitch (第 一虚拟机 )进行预先配置。具体可以通过 Host上的 Config and Manage Module 向第一虚拟机中的 Agent模块发送配置命令, 使得 Agent模块对 vSwitch的端 口映射、 VLAN管理等进行配置。 具体地,可以协商普通 VM与 vSwitch的通信方式、共享内存 Share Memory 和端口, 协商 vSwitch与 HOST NIC的通信方式和端口, 配置 vSwich的端口 映射, 以生成端口映射表。 其中, 通信方式可以包括共享内存、 10 直通、 零 拷贝或直接内存存取 ( Direct Memory Access, DMA )等。 共享内存是操作系 统进程间通信(IPC )的一种机制, 零拷贝为避免中央处理器 CPU将数据从一 块存储拷贝到另外一块存储的技术,其实现由 10直通、 MMAP等方式。其中, 作为更优选的实施例, 普通 VM与 vSwitch通过共享内存的方式进行通信, vSwitch与 Host NIC通过 10直通或 DMA方式进行通信, 可以使得本发明所 涉及的交换设备实现零拷贝, 从而降低了资源开销, 提高了交换效率。
402, 建立虚连接。 当 VM1需要向 Host外部( Host NIC )发送数据时, VM1首先与 vSwitch的第一虚拟端口 port1建立虚连接, 其中 port1是步骤 401中 Agent模块预先配置的与 VM1相对应的虚拟端口。 相应的物理过程为, VM1通过其虚拟网卡 VM NIC映射到 VM1对应的共享内存。
403, 写入待交换数据。 之后, VM1通过其 NIC向 port1发送待交换数据。 相应的实际物理过程为, 将待交换数据写入 VM1对应的共享内存。 VM1写入完毕后, 通过 port1向 vSwitch发送写完指示信息, 以通知 vSwitch进行下一步操作。 具体地, 该写完指示信息可以是写完中断。
404, 交换处理过程。 vSwitch接收到 VM1发送的写完指示信息后, 转入交换处理过程, 查询 vSwitch内部的由 Agent模块配置的端口映射表, 以确定待交换数据的流出端 口 (第二虚拟端口 port2 ) 以及相对应的 Host NIC。 具体地, 端口映射表中存 有配置输入端口、输出端口、源地址、目标地址等信息的对应关系。从而 vSwitch 可以根据待交换数据中携带的目标地址和端口等信息可以确定输出端口,从而 完成交换处理过程。这里的输入 /输出端口信息可以是 vSwitch的虚拟端口的端 口号,源地址 /目标地址可以是源节点 /目标节点的互联网协议 IP地址或多媒体 访问控制 MAC地址
405 , 读取待交换数据。
确定 port2后, vSwitch通过 port2向 Host NIC发送读取指示信息,该读取 指示信息中可携带待交换数据存入的共享内存的地址,令其读取共享内存中的 待交换数据。 Host NIC读取数据完毕后, 可以向 Host外部连接的设备或节点 发送待交换数据, 并通过 port2向 vSwitch发送读完指示信息, 以便于 vSwitch 恢复共享内存的可写权限, 即释放该共享内存, 其中读完指示信息可以为读完 中断。 应理解, 为了方便描述, 本发明实施例中以待交换数据为例来说明虚拟交 换的具体过程, 事实上, 实际的虚拟交换还可以是数据流、 信令、 消息等, 本 发明对此不做限定。 由上可见, 本发明实施例将虚拟交换功能实现在虚拟机上,使得虚拟交换 机与普通 VM处于同等优先级,形成对等的网络虚拟化架构,在进行资源分配 时虚拟交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行 管理和高效合理地进行资源分配。 应用于该计算节点上的虚拟交换方法包括: 第一虚拟机接收源节点发送的第一消息,第一消息用于请求第一虚拟机对待交 换数据进行交换处理, 其中待交换数据从源节点发往目标节点, 源节点和目标 节点中的至少一个为第二虚拟机;第一虚拟机根据待交换数据携带的目标节点 的地址和配置的端口映射表确定第二消息并发送所述第二消息,第二消息用于 指示目标节点从硬件层的存储设备获取待交换数据。该方法将虚拟交换功能从 Host内核中剥离解耦, 转而在虚拟机上实现虚拟交换的功能, 筒化了 Host内 核的设计和负担,并且由于 VM具有灵活性和很好的扩展性,从而使得 vSwitch 以及整个虚拟网络的扩展性和灵活性都得到了提高。 图 5是本发明另一实施例的虚拟交换数据流的示意图。如图 5所示,虚拟 交换机 vSwitch (虚拟交换功能)部署于第一虚拟机上, 使该第一虚拟机成为 虚拟交换设备, 并与普通的虚拟机 VM1、 VM2处于同等地位。 其中第一虚拟 机中的代理 Agent模块与主机 Host 中的配置管理模块 ( Config and Manage Module )连接, 以便于系统管理员对第一虚拟机进行配置。 第一虚拟机的虚拟 端口 port可以与 VM1、 VM2或者 VMM的底层物理网卡 HOST NIC进行连接。 以下通过数据流详细说明由 Host外界( Host NIC ) 向 Host中的普通 VM (例 如 VM1 )发送待交换数据的过程。 应理解, 图 5所示的系统架构仅仅为一个 示例, 其中 VM、 port, Host NIC等模块的数量可以进行扩展。 501 , 预先配置。 在进行虚拟交换之前, 需要构建虚拟网络, 并对虚拟交换机 vSwitch (第 一虚拟机 )进行预先配置。 具体可以通过 Host上的 Config and Manage Module 向第一虚拟机中的 Agent模块发送配置命令, 使得 Agent模块对 vSwitch的端 口映射、 VLAN管理等进行配置。具体的配置过程和配置项目与上述图 3中步 骤 301相类似, 此处不再赘述。
502, 确定共享内存。
Host NIC接收到从外界(源节点)传入的待交换数据后, 查询目标节点 ( VM1 )的地址,并通过 portl向 vSwitch发送携带有 VM1的地址的请求信息, 其中 portl是步骤 501中 Agent模块预先配置与 Host NIC相对应的虚拟端口, 之后 vSwitch驱动层直接访问该待交换数据, 查询 vSwitch内部的由 Agent模 块预先配置的端口映射表, 以确定待交换数据的流出端口 (第二虚拟端口 port2 )以及相对应的共享内存。而后通过 portl向 Host NIC发送接待有共享内 存地址的回复消息。 503 , 写入待交换数据。
Host NIC接收到共享内存地址后,将待交换数据写入共享内存中。写入方 式由步骤 501中 Agent模块预先配置, 例如, 通过 DMA方式写入。 Host NIC 写入完毕后, 通过 portl向 vSwitch发送写完指示信息, 以通知 vSwitch进行 下一步操作, 其中写完指示信息可以为写完中断。 504, 读取待交换数据。 vSwitch收到写完指示信息后, 通过 port2向 VM1发送读取指示信息, 以 通知其新数据到来。 VM1从共享内存中读取待交换数据完毕后, 通过 port2向 vSwitch发送读完指示信息, 以便于 vSwitch恢复共享内存的可写权限, 即释 放该共享内存。 应理解, 为了方便描述, 本发明实施例中以待交换数据为例来说明虚拟交 换的具体过程, 事实上, 实际的虚拟交换还可以是数据流、 信令、 消息等, 本 发明对此不做限定。 由上可见, 本发明实施例将虚拟交换功能实现在虚拟机上,使得虚拟交换 机与普通 VM处于同等优先级,形成对等的网络虚拟化架构,在进行资源分配 时虚拟交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行 管理和高效合理地进行资源分配。 应用于该计算节点上的虚拟交换方法包括: 第一虚拟机接收源节点发送的第一消息,第一消息用于请求第一虚拟机对待交 换数据进行交换处理, 其中待交换数据从源节点发往目标节点, 源节点和目标 节点中的至少一个为第二虚拟机;第一虚拟机根据待交换数据携带的目标节点 的地址和配置的端口映射表确定第二消息并发送所述第二消息,第二消息用于 指示目标节点从硬件层的存储设备获取待交换数据。该方法将虚拟交换功能从 Host内核中剥离解耦, 转而在虚拟机上实现虚拟交换的功能, 筒化了 Host内 核的设计和负担,并且由于 VM具有灵活性和很好的扩展性,从而使得 vSwitch 以及整个虚拟网络的扩展性和灵活性都得到了提高。 图 6是本发明另一实施例的虚拟交换数据流的示意图。如图 6所示,虚拟 交换机 vSwitch (虚拟交换功能)部署于第一虚拟机上, 使该第一虚拟机成为 虚拟交换设备, 并与普通的虚拟机 VM1、 VM2处于同等地位。 其中第一虚拟 机中的代理 Agent模块与主机 Host 中的配置管理模块 ( Config and Manage Module )连接, 以便于系统管理员对第一虚拟机进行配置。 第一虚拟机的虚拟 端口 port可以与 VM1、 VM2或者 VMM的底层物理网卡 HOST NIC进行连接。 以下通过数据流详细说明 Host中普通 VM之间 ( VM1与 VM2 )待交换数据 发送的过程。 应理解, 图 6所示的系统架构仅仅为一个示例, 其中 VM、 port, Host NIC等模块的数量可以进行扩展。
601 , 预先配置。
在进行虚拟交换之前, 需要构建虚拟网络, 并对虚拟交换机 vSwitch (第 一虚拟机 )进行预先配置。 具体可以通过 Host上的 Config and Manage Module 向第一虚拟机中的 Agent模块发送配置命令, 使得 Agent模块对 vSwitch的端 口映射、 VLAN管理等进行配置。具体的配置过程和配置项目与上述图 3中步 骤 301相类似, 此处不再赘述。
602, 共享内存协商。
Host中普通 VM之间需要通过 vSwitch协商共享内存以供通信。 具体地, VM1可以通过 vSwitch与 VM2协商, 由 vSwitch创建一个共享内存以供 VM1 和 VM2共享。 具体协商过程可以利用 Xen事件通道 ( Event Channel )的机制 进行。 VM1与 vSwitch的第一虚拟端口 portl建立虚连接, 其中 portl是步骤 601中 Agent模块预先配置与 VM1相对应的虚拟端口。 相应的物理过程为, VM1通过其虚拟网卡 VM NIC映射到 VM1与 VM2协商的共享内存。
603 , 写入待交换数据。 之后, VM1通过其 NIC向 portl发送待交换数据。 相应的实际物理过程 为, 将待交换数据写入 VM1对应的共享内存。 VM1写入完毕后, 通过 portl 向 vSwitch发送写完指示信息, 以通知 vSwitch进行下一步操作。 604, 读取待交换数据。 vSwitch向 VM2发送读取指示信息, 令其读取共享内存中的待交换数据。 Host NIC 读取数据完毕后, 向 Host 外部的目标节点发送待交换数据, 并向 vSwitch发送读完指示信息, 以便于 vSwitch恢复共享内存的可写权限, 即释 放该共享内存。 应理解, 为了方便描述, 本发明实施例中以待交换数据为例来说明虚拟交 换的具体过程, 事实上, 实际的虚拟交换还可以是数据流、 信令、 消息等, 本 发明对此不做限定。 由上可见, 本发明实施例将虚拟交换功能实现在虚拟机上,使得虚拟交换 机与普通 VM处于同等优先级,形成对等的网络虚拟化架构,在进行资源分配 时虚拟交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行 管理和高效合理地进行资源分配。 应用于该计算节点上的虚拟交换方法包括: 第一虚拟机接收源节点发送的第一消息,第一消息用于请求第一虚拟机对待交 换数据进行交换处理, 其中待交换数据从源节点发往目标节点, 源节点和目标 节点中的至少一个为第二虚拟机;第一虚拟机根据待交换数据携带的目标节点 的地址和配置的端口映射表确定第二消息并发送所述第二消息,第二消息用于 指示目标节点从硬件层的存储设备获取待交换数据。该方法将虚拟交换功能从 Host内核中剥离解耦, 转而在虚拟机上实现虚拟交换的功能, 筒化了 Host内 核的设计和负担,并且由于 VM具有灵活性和很好的扩展性,从而使得 vSwitch 以及整个虚拟网络的扩展性和灵活性都得到了提高。 图 7是本发明另一实施例的用于软件定义网络 SDN的虚拟交换设备的示 意图。 本发明通过将虚拟交换机 vSwitch与 Host内核解耦,并将 vSwitch部署到 第一虚拟机中, 筒化了 Host内核的设计和复杂程度。 并且由于虚拟机的可配 置性、 扩展性和灵活性较高, 从而也提高了 vSwitch乃至整个虚拟化网络的扩 展性和灵活性, 使得本发明实施例的虚拟交换设备可以实现控制面 control plane和数据面 data plane的分离, 也就是说, 满足 SDN的需求。
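The VM-to-VM exchange walked through in steps 601-604 above can also be sketched: the two VMs first negotiate a shared-memory region through the vSwitch (the text mentions Xen's event channel for this), then exchange data with write-complete / read-complete signals, after which the region is released. All names are assumptions for illustration:

```python
# Sketch of the VM-to-VM path (negotiate shared memory, write, read, release).

shared_memory = {}
negotiated = {}   # (src_addr, dst_addr) -> shm region

def negotiate(src, dst):
    shm = 0xC000 + len(negotiated)        # vSwitch allocates a region for the pair
    negotiated[(src, dst)] = shm
    return shm

def vm1_write(src, dst, data):
    shared_memory[negotiated[(src, dst)]] = data
    return {"type": "write_done", "src": src, "dst": dst}   # write-complete interrupt

def vswitch_forward(msg):
    shm = negotiated[(msg["src"], msg["dst"])]
    return {"type": "read", "shm": shm}   # second message, sent to the target VM

def vm2_read(msg):
    data = shared_memory[msg["shm"]]
    del shared_memory[msg["shm"]]         # read-complete: region is released
    return data

negotiate("vm1", "vm2")
note = vm1_write("vm1", "vm2", b"hello-vm2")
data = vm2_read(vswitch_forward(note))
```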
SDN是新一代网络架构, 与传统网络架构将协议分层, 控制面和数据面 相融合的做法不同, SDN在操作和控制层面将协议融合处理, 并将控制面和 数据面分开。 典型的 SDN方案为开放流 Openflow, 具体到在本发明实施例的 具有虚拟交换功能的第一虚拟机上实现 Openflow, 可以将交换设备的逻辑实 现分为两个部分:开放流控制器( Openflow Controller )和开放流流表( Openflow Flowtable ), 其中开放流控制器负责控制面, 用于网络拓朴配置, 数据转发策 略调整, 配置和维护 Openflow流表, Openflow流表则负责数据面, 是数据流 转发的查询映射表。 为了满足 SDN架构对交换设备的需求, 本发明可采用如 下两种部署方式: 第一种, Openflow Controller和 Openflow Flowtable实现在同一个 VM中, 也就是本发明中的具有虚拟交换功能的第一虚拟机, 其中 Openflow Controller 实现在用户空间, 而 Flowtable可实现在用户空间, 亦可实现在内核空间; 第二种, Openflow Controller和 Openflow Flowtable分别实现在两个具有 虚拟交换功能的虚拟机中, 例如, 可以将 Openflow Controller部署在第一虚拟 机中, 运行在 Host之上的至少一个 VM之中还包括具有虚拟交换功能的第四 虚拟机,第四虚拟机与第一虚拟机相类似, 两者使用 VM间的通信技术交互信 息, 例如 Xen 的事件 Event Channel。 具体地, 如图 7所示, 虚拟交换机 vSwitch的 Controller和 FlowTable部 署于第一虚拟机上, 或者部署于不同的两个虚拟机上, 使得该 vSwitch与普通 的虚拟机 VM1、 VM2处于同等地位。 其中 Controller中的代理 Agent模块与 主机 Host中的配置管理模块 ( Config and Manage Module )连接, 以便于系统 管理员对 vS witch进行配置。 Flowtable部分的虚拟端口 port可以与 VM1、 VM2 或者 VMM的底层物理网卡 HOST NIC进行连接。应理解, 图 7所示的系统架 构仅仅为一个示例, 其中 VM、 port, Host NIC等模块的数量可以进行扩展。 Openflow Controller 和 Flowtable 相互配合实现业务流的转发, 其中
Controller包含用户配置数据库和一个规则库, Flowtable是一个以业务流为单 位的表结构, 包含匹配和执行和部分。 Flowtable每一个表项 entry代表一个业 务流, 匹配部分是待交换数据 IP、 MAC和 Port等字段, 执行部分表示对匹配 待交换数据的处理, 包括转发、 丟包和向 Controller申请新的表项。 例如, 每 当有待交换数据到达 vSwitch时, vSwitch检查待交换数据的 IP , Mac和 Port 等字段, 并搜索 Flowtable, 寻找匹配 entry; 若找到匹配表项, 按照 Action执 行操作; 若未找到匹配表项, Flowtable 向 Controller 发送表项建立请求, Controller收到请求后,查询规则库,建立新的表项,并发给 Flowtable; Flowtable 插入新的表项, 并将后续符合此表项的待交换数据按规则转发。 由上可见, 本发明实施例将虚拟交换功能实现在虚拟机上,使得虚拟交换 机与普通 VM处于同等优先级,形成对等的网络虚拟化架构,在进行资源分配 时虚拟交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行 管理和高效合理地进行资源分配。 该方法将虚拟交换功能从 Host内核中剥离 解耦, 降低 Host与 vSwitch的耦合性, 可以在同一 Host内部署多个 vSwitch 不受 Host的约束,并且由于 VM具有灵活性和很好的扩展性,从而使得 vSwitch 以及整个虚拟网络的扩展性和灵活性都得到了提高。本发明还将配置模块与待 交换数据交换转发模块相分离, 更加地符合可编程网络设计,从而能够在本发 明实施例的虚拟化网络架构上实现 SDN。 图 8是本发明另一实施例的分布式实施的示意图。 如图 8 所示。 本发明实施例的配置架构包括一个主虚拟交换机 Master vSwitch和两个个从属虚拟交换机 Slave vSwitch, 应理解的是, 图 8为了方便 描述, 仅仅示出两个从属 vSwitch, 这对本发明并不造成限定, 事实上可以有 若干个从属 vSwitch。图 8中的每一个主机 Host都与上述实施例中描述的运行 在硬件层之上的 Host相同, 且这些 Host可以是运行在同一个物理机的硬件层 之上的 Host, 也可以是运行在不同物理机的硬件层之上的 Host, 本发明对此 不做限定。 其中, 每一个 vSwitch均为本发明所涉及的具有虚拟交换功能的虚 拟机, 也就是说每一个 vSwitch都与上述实施例中的具有虚拟交换功能的第一 虚拟机相类似。 各个 Host中的主管理模块和从属管理模块都可以对应于上述 实施例 Host中的配置管理模块 Config and Manage Module, 相应地, Master vSwitch的控制管理模块被设定为主管理模块 Master Manager, Slave vSwitch 的控制管理模块被设定为从属管理模块 Slave Manager„ Master Manager 和 Slave Manager对其 Host的 vSwitch管理方式与上述各实施例的方式相同, 可 以通过 vSwitch中的 agent模块来配置和管理 vSwitch( agent未在图 8中示出)。 其中 Master Manager是用户配置的接口,可以由用户通过客户端程序直接进行 配置, Master Manager通过十办议与 Slave Manager通信, 十办商各个 vSwitch之 间的端口映射, 而 Master Manager与 Slave Manager之间的通信则为控制流, 主从 vSwitch之间的通信则为数据流。 具体地, 本发明实施例的分布式 vSwitch的配置过程为: 首先在一个 Host 上创建 Master vSwitch, 之后创建 vSwitch级联配置, 包括各个 Slave vSwitch, 以及所有 vSwitch上的 IP地址和端口映射; 之后通过配置协议将上述配置信 息发送到其他 Host, 至此, 承载 Master vSwitch的 Host为主 Host, 接收配置 信息的其他 Host为从属 Host; 之后, 接收到配置信息的各个从属 Host创建控 制管理模块, 即从属管理模块; 最后, 各个从属管理模块按照接收到的配置信 息配置器对应的 Slave vSwitch上的 IP地址和端口。 应理解, 本发明实施例所 涉及的配置协议,包括但不限于可扩展标记语言 XML、超文本传输协议 HTTP 等应用协议。 作为一个具体的例子, 本发明实施例的分布式交换架构的配置过程如图 9 所示: 901 , 用户登陆 
Host0中的 Manage Module创建一个 vSwitch实例, 并将其定义为 Master。
902, 通过通信协议, 将配置消息传输至 Host1和 Host2的 Manage Module。
903 , Hostl和 Host2的 Manage Module接收到配置消息,按照配置要求创 建 vSwitch实例,并定义为 Slave,然后将其 Master指针指向 HostO的 vSwitch; 才艮据配置中的端口映射, 配置其 vSwitch的端口映射。 本发明实施例将虚拟交换功能从 Host 内核中剥离解耦, 降低 Host 与 vSwitch的耦合性,可以在同一 Host内部署多个 vSwitch不受 Host的约束,并 且由于 vSwitch在用户操作系统 Guest OS中实现, 无需再依赖内核操作系统 Host OS/VMM OS, 使得 vSwitch非常容易部署, 具有良好的移植性, 从而使 得 vSwitch以及整个虚拟网络的扩展性和灵活性都得到了提高, 本发明实施例 中的分布式架构将多个 vSwitch级联, 使得虚拟网络得到大幅扩展以及虚拟交 换能力得到大幅提升。 图 10是本发明一个实施例的宿主机的模块架构示意图。 图 10的宿主机 1000包括创建模块 1001和配置模块 1002。 创建模块 1001 , 用于在输入 /输出 I/O设备的 I/O虚拟功能启动后, 在宿 主机 Host中产生至少一个虚拟机 VM , 其中至少一个 VM包括具有虚拟交换 功能的第一虚拟机, 至少一个 VM还包括第二虚拟机; 配置模块 1002 , 用于向第一虚拟机发送配置命令, 以便于第一虚拟机根 据配置命令配置用于与第二虚拟机进行通信的第一虚拟机的第一虚拟端口,并 配置用于与 I/O设备进行通信的第一虚拟机的第二虚拟端口。 可以理解的是, 本实施例宿主机 1000可如上述方法实施例中的 Host, 其 各个功能模块的功能可以根据上述方法实施例中的方法具体实现,其具体实现 过程可以参照上述方法实施例的相关描述, 此处不再赘述。 由上可见, 本实施例中在 I/O设备的 I/O虚拟功能启动后, HostlOOO通过 创建模块 1001产生至少一个运行在 HostlOOO之上的虚拟机。 具体地, 该创建 模块 1001可以是配置管理模块 ( Config and Manage Module ), 创建模块 1001 还可以通过使用 Qemu等工具创建虚拟机的虚拟网卡接口( VM NIC ) , 由创建 模块 1001 产生的虚拟机中有至少一个具有虚拟交换功能的第一虚拟机 vSwitch以及若干普通虚拟机 VM。 之后配置模块 1002 , 即 Config and Manage Module , 向 Agent模块发送配 置命令,其中配置模块 1002通过进程间通信技术 IPC (如 IOCTL, NETLINK, SOCKET等)与 Agent相连接, 配置模块 1002将 Host虚拟环境的配置传入第 一虚拟机的 Agent, 具体可以包括 Host 1000下层物理网卡、 虚拟机的前后端 FE/BE、 共享内存、 DMA 中断等配置信息, 使得第一虚拟机获得虚拟环境信 息, 从而建立相应的虚拟网络环境。 由上可见, 通过 HostlOOO 搭建的虚拟网络环境, 虚拟交换功能得以从 HostlOOO 内核中剥离解耦, 转而在第一虚拟机上实现虚拟交换的功能, 筒化 了 Host内核的设计和负担, 并且由于 VM具有灵活性和很好的扩展性, 从而 使得 vSwitch以及整个虚拟网络的扩展性和灵活性都得到了提高。 进一步地, 因为将虚拟交换功能实现在虚拟机上,使得虚拟交换机与普通 VM处于同等地 位, 具有相同的优先级, 形成对等的网络虚拟化架构, 在进行资源分配时虚拟 交换机和普通 VM—样使用用户空间的物理资源, 这样便于 Host进行管理和 高效合理地进行资源分配。 图 11 是本发明一个实施例的虚拟机的模块架构示意图。 图 11 的虚拟机 1100包括接收模块 1101、 交换处理模块 1102和发送模块 1103。 接收模块 1101 , 用于接收源节点发送的第一消息, 所述第一消息用于请 求所述虚拟机 1100对待交换数据进行交换处理, 其中所述待交换数据是从所 述源节点发往目标节点的,所述源节点和所述目标节点中的至少一个为第二虚 拟机, 所述第二虚拟机运行在所述 Host之上; 交换处理模块 1102, 用于根据所述待交换数据携带的目标节点的地址和 所述虚拟机 1100配置的端口映射表确定第二消息, 所述第二消息用于指示所 述目标节点从所述硬件层的存储设备获取所述待交换数据; 发送模块 1103 , 用于向所述目标节点发送所述第二消息。 本发明实施例的虚拟机 1100为具有虚拟交换功能的虚拟机, 与其他普通 虚拟机具有同等地位, 部署于 Host上。 其中源节点可以是 Host上的普通虚拟 机, 
也可以是 Host外部的虚拟机或物理机。 同样地, 目标节点也可以是 Host 上的普通虚拟机, 也可以是 Host外部的虚拟机或物理机。 可以理解的是, 本发明实施例的虚拟机 1100可如上述方法实施例中的具 有虚拟交换功能的第一虚拟机,其各个功能模块的功能可以根据上述方法实施 例中的方法具体实现, 其具体实现过程可以参照上述方法实施例的相关描述, 此处不再赘述。 本发明实施例通过将虚拟交换功能部署到虚拟机中, 筒化了 VMM, 有利 于 Host对虚拟网络进行管理并进行高效、 合理的网络资源分配。 可选地, 作为一个实施例, 虚拟机 1100还包括代理 Agent模块 1104和生 成模块 1105。具体地,代理 Agent模块 1104,用于根据 Host发送的配置命令, 配置用于与第二虚拟机进行通信的虚拟机的第一虚拟端口 1106, 并配置用于 与 I/O设备进行通信的虚拟机的第二虚拟端口 1107。 生成模块 1105 , 用于建 立第一虚拟端口 1106与第二虚拟端口 1107之间的映射关系,以生成端口映射 表。
可选地, 作为一个实施例, Agent模块 1104, 还用于根据配置命令配置第 二虚拟机对应的第一共享内存,其中第一共享内存为硬件层的存储设备上的指 定存储区域。 具体可以通过第二虚拟机与虚拟机 1100之间的事件通道协商第 一共享内存。接收模块 1101 ,具体用于通过第一虚拟端口 1106接收第一消息, 第一消息包括用于向虚拟机 1100指示源节点已完成将待交换数据写入第一共 享内存的写完中断; 交换处理模块 1102 , 具体用于根据用于接收第一消息的 第一虚拟端口 1106确定对应的第一共享内存的地址; 从第一共享内存获取待 交换数据携带的目标节点的地址,以便于确定目标节点所对应的第二虚拟端口 1107;确定携带有第一共享内存的地址和读取指令的第二消息。发送模块 1103 , 具体用于通过端口映射表中与第一虚拟端口 1106对应的第二虚拟端口 1107向 目标节点发送第二消息; 其中, 源节点为第二虚拟机, 目标节点为 I/O设备。 可选地, 作为一个实施例, 接收模块 1101 , 具体用于接收源节点发送的 第一消息; 交换处理模块 1102, 具体用于获取待交换数据携带的目标节点的 地址;根据目标节点的地址查询端口映射表以确定与目标节点对应的第一虚拟 端口 1106并确定与第二虚拟机对应的第一共享内存的地址; 发送模块 1103 , 具体用于通过 I/O设备所对应的第二虚拟端口 1107向目标节点发送携带有第 一共享内存的地址的回复消息; 交换处理模块 1102, 还用于在接收到源节点 发送的用于向虚拟机 1100指示源节点已完成将待交换数据写入第一共享内存 的写完中断时, 确定携带有读取指令的第二消息; 发送模块 1103 , 还用于通 过第一虚拟端口 1106向目标节点发送第二消息; 接收模块 1101 , 还用于接收 源节点发送的指示源节点已完成将待交换数据写入第一共享内存的写完中断; 其中, 源节点为 I/O设备, 目标节点为第二虚拟机。 可选地, 作为一种实现方式, 接收模块 1101 , 具体用于通过第一虚拟端 口 1106接收源节点发送的第一消息, 第一消息包括写完中断; 交换处理模块 1102, 具体用于根据用于接收第一消息的第一虚拟端口 1106确定对应的源节 点的地址;根据源节点的地址和待交换数据携带的目标节点的地址确定第二共 享内存的地址; 确定携带有第二共享内存的地址和读取指令的第二消息; 发送 模块 1103 , 具体用于向目标节点发送第二消息。 可选地, 作为另一个实施例, 接收模块 1101还用于: 接收目标节点发送 的读完指示信息, 以便于虚拟机 1100释放第一共享内存或第二共享内存。
具体地, 第一虚拟机从 I/O设备获取待交换数据携带的目标节点的地址, 是由第一虚拟机在接收到第一消息的通知后,得知 I/O设备(即底层物理网卡 ) 接收到了待交换数据,之后第一虚拟机则可以通过驱动层直接访问该待交换数 据以获取其携带的目标节点的地址。
可选地, 在一种实现方式下, 在端口映射表为开放流 Openflow流表时, 第一虚拟机 1231还包括包含 Agent模块 1104的 Openflow控制器, 其中: 在 接收模块 1101接收源节点发送的第一消息之后,,交换处理模块 1102还用于, 根据待交换数据携带的目标节点的地址, 在 Openflow流表中确定与目标节点 的地址所匹配的表项, 其中, Openflow 流表中包括至少一个表项, 表项包括 地址、 虚拟端口和执行动作参数; 如果匹配的表项存在, 根据匹配的表项中与 目标节点的地址所对应的执行动作参数处理待交换数据;如果匹配的表项不存 在, 向 Openflow控制器发送表项建立请求, 以便于 Openflow控制器根据表项 建立请求建立能够与待交换数据匹配的新表项, 并在 Openflow流表中插入新 表项。
本发明实施例通过将虚拟交换功能部署到虚拟机 1100中, 使得具有虚拟 交换功能的虚拟机 1100与其他普通虚拟机处于同等地位,从而有利于 Host对 虚拟网络进行管理并进行高效、合理的网络资源分配。 并且由于虚拟交换功能 从 Host核心中剥离, 从而增强了扩展性, 使其虚拟机 1100满足 SDN的需求, 支持 Openflow。 图 12是本发明一个实施例的计算机节点的示意图。图 12所示的计算节点 1200可包括: 硬件层 1210、 运行在硬件层 1210之上的宿主机 Host 1220、 以及运行在 Host 1220之上的至少一个虚拟机 VM1230; 其中, 硬件层 1210包括输入 /输出 I/O设备 1211和存储设备 1212 , 至少 一个虚拟机 VM1230 包括具有虚拟交换功能的第一虚拟机 1231 , 至少一个 VM1230还包括第二虚拟机 1232。
第一虚拟机 1231 , 用于接收源节点发送的第一消息, 第一消息用于请求 第一虚拟机 1231对待交换数据进行交换处理, 其中待交换数据是从源节点发 往目标节点的, 源节点和目标节点中的至少一个为第二虚拟机 1232; 第一虚拟机 1231 , 还用于根据待交换数据携带的目标节点的地址和配置 的端口映射表确定第二消息并发送所述第二消息,第二消息用于指示目标节点 从硬件层的存储设备 1212获取待交换数据。 此外, Hostl220, 用于向第一虚拟机 1231发送配置命令; 第一虚拟机 1231 , 还用于根据配置命令, 通过第一虚拟机的代理 Agent 模块配置用于与第二虚拟机进行通信的第一虚拟机的第一虚拟端口,并配置用 于与 I/O设备 1211进行通信的第一虚拟机的第二虚拟端口; 第一虚拟机 1231 , 还用于建立第一虚拟端口与第二虚拟端口之间的映射 关系, 以生成端口映射表。 可选地, 第一虚拟机 1231 , 还用于根据配置命令配置第二虚拟机 1232对 应的第一共享内存,其中第一共享内存为硬件层 1210的存储设备 1212上的指 定存储区域。
具体地, 作为一个数据流和信令流交互的例子, 当源节点为第二虚拟机 1232, 目标节点为 I/O设备 1211时: 源节点 1232 , 用于将待交换数据写入第一共享内存; 源节点 1232 , 还用于向第一虚拟机 1231发送第一消息; 第一虚拟机 1231 , 具体用于通过第一虚拟端口接收第一消息, 第一消息 包括用于向第一虚拟机 1231指示源节点 1232已完成将待交换数据写入第一共 享内存的写完中断;以及根据用于接收第一消息的第一虚拟端口确定对应的第 一共享内存的地址; 从第一共享内存获取待交换数据携带的目标节点 1211的 地址, 以便于确定目标节点 1211所对应的第二虚拟端口; 确定携带有第一共 享内存的地址和读取指令的第二消息,并通过端口映射表中与第一虚拟端口对 应的第二虚拟端口向目标节点 1211发送第二消息; 目标节点 1211 , 用于根据第二消息从第一共享内存读取待交换数据; 具体地,作为一个数据流和信令流交互的例子,当源节点为 I/O设备 1211 , 目标节点为第二虚拟机 1232时: 第一虚拟机 1231 , 具体用于接收源节点 1211发送的第一消息, 获取待交 换数据携带的目标节点 1232的地址;根据目标节点 1232的地址查询端口映射 表以确定与目标节点 1232对应的第一虚拟端口并确定与第二虚拟机 1232对应 的第一共享内存的地址; 通过 I/O设备 1211所对应的第二虚拟端口向目标节 点 1232发送携带有第一共享内存的地址的回复消息; 以及, 在接收到源节点 1211发送的用于向第一虚拟机指示源节点 1211已完成将待交换数据写入第一 共享内存的写完中断时, 确定携带有读取指令的第二消息,通过第一虚拟端口 向目标节点 1232发送第二消息; 源节点 1211 , 还用于根据回复消息中的第一共享内存的地址将待交换数 据写入第一共享内存; 源节点 1211 , 还用于向第一虚拟机发送指示源节点 1211已完成将待交换 数据写入第一共享内存的写完中断; 目标节点 1232, 用于根据第二消息从第一共享内存读取待交换数据; 具体地,作为一个数据流和信令流交互的例子, 当源节点和目标节点同为 至少一个 VM1230中的普通虚拟机时, 假设源节点为第二虚拟机 1232, 目标 节点为第三虚拟机 1233: 源节点 1232 , 还用于将待交换数据写入源节点 1232与目标节点 1233通 过第一虚拟机 1231 预先协商的第二共享内存, 其中第二共享内存为硬件层 1210的存储设备 1212上的指定存储区域; 源节点 1232, 还用于通过第一虚拟端口向第一虚拟机发送第一消息, 第 一消息包括写完中断; 第一虚拟机 1231 , 具体用于根据用于接收第一消息的第一虚拟端口确定 对应的源节点 1232的地址;根据源节点 1232的地址和待交换数据携带的目标 节点 1233的地址确定第二共享内存的地址; 确定携带有第二共享内存的地址 和读取指令的第二消息, 并向目标节点 1233发送第二消息; 目标节点 1233 , 用于根据第二消息从第二共享内存读取待交换数据。 可选地,作为一个实施例, 在目标节点根据第二消息从共享内存读取待交 换数据之后, 目标节点可以向第一虚拟机 1231发送读完指示信息, 以便于第 一共享内存或第二共享内存被释放; 第一虚拟机 1231 , 在接收到该读完指示 信息后, 释放第一共享内存或第二共享内存。 具体地, 第一虚拟机从 I/O设备获取待交换数据携带的目标节点的地址, 是由第一虚拟机在接收到第一消息的通知后,得知 I/O设备(即底层物理网卡 ) 接收到了待交换数据,之后第一虚拟机则可以通过驱动层直接访问该待交换数 据以获取其携带的目标节点的地址。 可选地, 在端口映射表为开放流 Openflow流表时, 在接收源节点发送的 第一消息之后, 第一虚拟机 1231还用于: 根据待交换数据携带的目标节点的 地址, 配置的 Openflow流表中确定与目标节点的地址所匹配的表项, 其中, Openflow 流表中包括至少一个表项, 表项包括地址、 虚拟端口和执行动作参 数; 如果匹配的表项存在,根据匹配的表项中与目标节点的地址所对应的执行 动作参数处理待交换数据; 如果匹配的表项不存在, 建立能够与待交换数据匹 配的新表项, 并在该 Openflow流表中插入新表项。 综上, 本发明实施例中计算节点 1200可包括: 硬件层 1210、 运行在所述 硬件层 1210之上的宿主机 Hostl220、以及运行在所述 Hostl220之上的至少一 个虚拟机 VM1230, 其中, 所述硬件层包括输入 /输出 I/O设备 1211和存储设 备 1212,所述至少一个虚拟机 
VM 包括具有虚拟交换功能的第一虚拟机 1231, 所述至少一个 VM 还包括第二虚拟机 1232; 如此, 将虚拟交换功能实现在虚拟机上, 使得虚拟交换机与普通 VM 处于同等优先级, 形成对等的网络虚拟化架构, 在进行资源分配时虚拟交换机和普通 VM 一样使用用户空间的物理资源, 这样便于 Host 进行管理和高效合理地进行带宽、CPU、存储等资源的分配。

应用于该计算节点上的虚拟交换方法包括: 所述第一虚拟机接收源节点发送的第一消息, 所述第一消息用于请求所述第一虚拟机对待交换数据进行交换处理, 其中所述待交换数据从所述源节点发往目标节点, 所述源节点和所述目标节点中的至少一个为所述第二虚拟机; 所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端口映射表确定第二消息并发送所述第二消息, 所述第二消息用于指示所述目标节点从所述硬件层的存储设备获取所述待交换数据。

该方法将虚拟交换功能从 Host 内核中剥离解耦, 降低与 Host 的耦合性, 可以在同一 Host 内部署多个 vSwitch, 不受 Host 约束, 因此具有更强的扩展性; 并且解耦后 vSwitch 不再依赖 Host 内核中的操作系统, 变得更加易于部署, 所以获得了更好的移植性; 并且由于配置模块(Agent)与待交换数据交换转发模块(端口映射表)相分离, 更符合软件定义网络的要求。
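以第二虚拟机经第一虚拟机向 I/O 设备发送数据为例, 上述"第一消息—查端口映射表—第二消息—共享内存读取"的信令流可粗略示意如下。共享内存以 Python 字典模拟, 端口名、消息字段和区域名均为假设的简化模型, 仅用于说明数据面与信令面的配合:

```python
# 示意:基于共享内存与端口映射表的虚拟交换信令流(均为假设的简化模型)
shared_memory = {}                    # 模拟硬件层存储设备上的共享内存区域
port_map = {"vport_vm2": "vport_io"}  # 第一虚拟端口 -> 第二虚拟端口

def vm2_send(data: bytes, region: str = "shm1") -> dict:
    # 源节点(第二虚拟机)将待交换数据写入第一共享内存,
    # 并向第一虚拟机发送含写完中断的第一消息
    shared_memory[region] = data
    return {"type": "first", "irq": "write_done",
            "in_port": "vport_vm2", "region": region}

def vswitch_vm_handle(msg: dict) -> dict:
    # 第一虚拟机:按接收第一消息的虚拟端口确定共享内存地址,
    # 查端口映射表得到出口端口, 构造携带共享内存地址与读取指令的第二消息
    out_port = port_map[msg["in_port"]]
    return {"type": "second", "cmd": "read",
            "region": msg["region"], "out_port": out_port}

def io_device_recv(msg: dict) -> bytes:
    # 目标节点(I/O 设备)按第二消息从第一共享内存读取待交换数据
    return shared_memory[msg["region"]]

m1 = vm2_send(b"packet-bytes")
m2 = vswitch_vm_handle(m1)
payload = io_device_recv(m2)
```

可以看到数据本身只写入和读取共享内存各一次, 第一虚拟机只搬运控制消息, 这正是该交换方式避免数据多次拷贝的要点。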
图 13是本发明一个实施例的计算机系统的示意图。 参见图 13 , 本发明实 施例还提供一种计算机系统 1300, 可包括:
至少一个计算节点 1200。
需要说明的是, 对于前述的各方法实施例, 为了简单描述, 故将其都表述为一系列的动作组合, 但是本领域技术人员应该知悉, 本发明并不受所描述的动作顺序的限制, 因为依据本发明, 某些步骤可以采用其他顺序或者同时进行。其次, 本领域技术人员也应该知悉, 说明书中所描述的实施例均属于优选实施例, 所涉及的动作和模块并不一定是本发明所必须的。

在上述实施例中, 对各个实施例的描述都各有侧重, 某个实施例中没有详述的部分, 可以参见其他实施例的相关描述。
综上, 本发明实施例的计算机系统 1300 中的计算节点 1200 可包括: 硬件层、运行在所述硬件层之上的宿主机 Host、以及运行在所述 Host 之上的至少一个虚拟机 VM, 其中, 所述硬件层包括输入/输出 I/O 设备和存储设备, 所述至少一个虚拟机 VM 包括具有虚拟交换功能的第一虚拟机, 所述至少一个 VM 还包括第二虚拟机; 如此, 将虚拟交换功能实现在虚拟机上, 使得虚拟交换机与普通 VM 处于同等优先级, 形成对等的网络虚拟化架构, 在进行资源分配时虚拟交换机和普通 VM 一样使用用户空间的物理资源, 这样便于 Host 进行管理和高效合理地进行带宽、CPU、存储等资源的分配。应用于该计算节点上的虚拟交换方法包括: 所述第一虚拟机接收源节点发送的第一消息, 所述第一消息用于请求所述第一虚拟机对待交换数据进行交换处理, 其中所述待交换数据从所述源节点发往目标节点, 所述源节点和所述目标节点中的至少一个为所述第二虚拟机; 所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端口映射表确定第二消息并发送所述第二消息, 所述第二消息用于指示所述目标节点从所述硬件层的存储设备获取所述待交换数据。该方法将虚拟交换功能从 Host 内核中剥离解耦, 降低与 Host 的耦合性, 可以在同一 Host 内部署多个 vSwitch, 不受 Host 约束, 因此具有更强的扩展性; 并且解耦后 vSwitch 不再依赖 Host 内核中的操作系统, 变得更加易于部署, 所以获得了更好的移植性; 并且由于配置模块(Agent)与待交换数据交换转发模块(端口映射表)相分离, 更符合软件定义网络的要求。

本领域普通技术人员可以意识到, 结合本文中所公开的实施例中描述的各方法步骤和单元, 能够以电子硬件、计算机软件或者二者的结合来实现。为了清楚地说明硬件和软件的可互换性, 在上述说明中已经按照功能一般性地描述了各实施例的步骤及组成。这些功能究竟以硬件还是软件方式来执行, 取决于技术方案的特定应用和设计约束条件。本领域普通技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能, 但是这种实现不应认为超出本发明的范围。

结合本文中所公开的实施例描述的方法或步骤可以用硬件、处理器执行的软件程序, 或者二者的结合来实施。软件程序可以置于随机存储器(RAM)、内存、只读存储器(ROM)、电可编程 ROM、电可擦除可编程 ROM、寄存器、硬盘、可移动磁盘、CD-ROM, 或技术领域内所公知的任意其它形式的存储介质中。
本发明并不限于此。在不脱离本发明的精神和实质的前提下, 本领域普通技术 人员可以对本发明的实施例进行各种等效的修改或替换,而这些修改或替换都 应在本发明的涵盖范围内。

Claims

权 利 要 求
1. 一种虚拟交换方法, 其特征在于, 应用于计算节点上, 所述计算节点 包括: 硬件层、 运行在所述硬件层之上的宿主机 Host、 以及运行在所述 Host 之上的至少一个虚拟机 VM, 其中, 所述硬件层包括输入 /输出 I/O设备和存储 设备,所述至少一个虚拟机 VM包括具有虚拟交换功能的第一虚拟机,所述至 少一个 VM还包括第二虚拟机:
所述方法包括:
所述第一虚拟机接收源节点发送的第一消息,所述第一消息用于请求所述 第一虚拟机对待交换数据进行交换处理,其中所述待交换数据是从所述源节点 发往目标节点的, 所述源节点和所述目标节点中的至少一个为所述第二虚拟 机;
所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端 口映射表确定第二消息并发送所述第二消息,所述第二消息用于指示所述目标 节点从所述硬件层的存储设备获取所述待交换数据。
2. 根据权利要求 1 所述的方法, 其特征在于, 所述第一虚拟机接收源节 点发送的第一消息之前, 还包括:
所述第一虚拟机接收所述 Host发送的配置命令;
所述第一虚拟机根据所述配置命令配置用于与所述第二虚拟机进行通信 的所述第一虚拟机的第一虚拟端口, 并配置用于与所述 I/O设备进行通信的所 述第一虚拟机的第二虚拟端口;
所述第一虚拟机建立所述第一虚拟端口与所述第二虚拟端口之间的映射 关系, 以生成所述端口映射表。
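权利要求 2 所述的"根据配置命令配置第一、第二虚拟端口并建立映射关系以生成端口映射表"的过程, 可用如下示意代码说明。配置命令的字段名和端口名均为假设, 仅为一个简化草图:

```python
# 示意:第一虚拟机根据 Host 下发的配置命令生成端口映射表(字段均为假设)
def build_port_map(config_cmd: dict) -> dict:
    # 为每对端口建立双向映射:第一虚拟端口(对第二虚拟机)
    # <-> 第二虚拟端口(对 I/O 设备), 便于双向转发时查询
    mapping = {}
    for vm_port, io_port in config_cmd["pairs"]:
        mapping[vm_port] = io_port
        mapping[io_port] = vm_port
    return mapping

cmd = {"pairs": [("vport_vm2", "vport_io0"), ("vport_vm3", "vport_io1")]}
pmap = build_port_map(cmd)
```

映射做成双向, 是因为后续权利要求中既有由第二虚拟机发往 I/O 设备的方向, 也有由 I/O 设备发往第二虚拟机的方向。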
3. 根据权利要求 2所述的方法, 其特征在于, 所述接收所述 Host发送的 配置命令之后,还包括: 所述第一虚拟机根据所述配置命令配置所述第二虚拟 机对应的第一共享内存,其中所述第一共享内存为所述硬件层的存储设备上的 指定存储区域。
4. 根据权利要求 3 所述的方法, 其特征在于, 当所述源节点为所述第二 虚拟机, 所述目标节点为所述 I/O设备时,
所述第一虚拟机接收源节点发送的第一消息, 包括: 所述第一虚拟机通过 所述第一虚拟端口接收所述第二虚拟机发送的所述第一消息 ,所述第一消息包 括用于向所述第一虚拟机指示所述第二虚拟机已完成将所述待交换数据写入 所述第一共享内存的写完中断;
所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端 口映射表确定第二消息并发送所述第二消息, 包括: 所述第一虚拟机根据用于 接收所述第一消息的所述第一虚拟端口确定对应的所述第一共享内存的地址; 从所述第一共享内存获取所述待交换数据, 根据所述待交换数据携带的所述 I/O设备的地址从所述端口映射表中确定与所述 I/O设备对应的所述第二虚拟 端口; 确定携带有所述第一共享内存的地址和读取指令的所述第二消息, 并通 过所述第二虚拟端口向所述 I/O设备发送所述第二消息, 以便于所述 I/O设备 从所述第一共享内存读取所述待交换数据。
5. 根据权利要求 3所述的方法,其特征在于, 当所述源节点为所述 I/O设 备, 所述目标节点为所述第二虚拟机时,
所述第一虚拟机接收源节点发送的第一消息之后还包括:所述第一虚拟机 从所述 I/O设备获取所述待交换数据携带的目标节点的地址,所述目标节点的 地址为所述第二虚拟机的地址;
所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端 口映射表确定第二消息并发送所述第二消息, 包括: 所述第一虚拟机根据所述 第二虚拟机的地址查询所述端口映射表以确定与所述第二虚拟机对应的第一 虚拟端口并确定与所述第二虚拟机对应的第一共享内存的地址; 通过所述 I/O 设备所对应的所述第二虚拟端口向所述 I/O设备发送携带有所述第一共享内存 的地址的回复消息, 以便于所述 I/O设备根据所述回复消息将所述待交换数据 写入所述第一共享内存;在所述第一虚拟机接收到所述 I/O设备发送的用于向 所述第一虚拟机指示所述 I/O设备已完成将所述待交换数据写入所述第一共享 内存的写完中断时,确定携带有读取指令的所述第二消息,通过所述第一虚拟 端口向所述第二虚拟机发送所述第二消息,以便于所述第二虚拟机从所述第一 共享内存读取所述待交换数据。
6. 根据权利要求 2所述的方法, 其特征在于, 所述至少一个 VM还包括 第三虚拟机, 当所述源节点为所述第二虚拟机, 所述目标节点为所述第三虚拟 机时,
所述第一虚拟机接收源节点发送的第一消息, 包括: 所述第一虚拟机通过 所述第一虚拟端口接收所述第二虚拟机发送的所述第一消息,所述第一消息包 括用于向所述第一虚拟机指示所述第二虚拟机已完成将所述待交换数据写入 所述第二虚拟机与所述第三虚拟机通过所述第一虚拟机预先协商的第二共享 内存的写完中断,其中所述第二共享内存为所述硬件层的存储设备上的指定存 储区域;
所述第一虚拟机根据所述待交换数据携带的目标节点的地址和配置的端 口映射表确定第二消息并发送所述第二消息, 包括: 所述第一虚拟机根据用于 接收所述第一消息的所述第一虚拟端口确定与所述第一虚拟端口对应的所述 第二虚拟机的地址;根据所述第二虚拟机的地址和所述待交换数据携带的第三 虚拟机的地址确定所述第二共享内存的地址;确定携带有所述第二共享内存的 地址和读取指令的所述第二消息, 并向所述第三虚拟机发送所述第二消息, 以 便于所述第三虚拟机从所述第二共享内存读取所述待交换数据。
7. 根据权利要求 4至 6中任意一项所述的方法, 其特征在于, 所述方法 还包括: 接收所述目标节点发送的读完指示信息, 以便于所述第一共享内存或 所述第二共享内存被释放。
8. 根据权利要求 1至 7中任意一项所述的方法, 其特征在于, 在所述端 口映射表为开放流 Openflow流表时, 所述第一虚拟机接收源节点发送的第一 消息之后, 还包括:
所述第一虚拟机根据所述待交换数据携带的目标节点的地址, 在所述 Openflow 流表中确定与所述目标节点的地址所匹配的表项, 其中, 所述 Openflow 流表中包括至少一个表项, 所述表项包括地址、 虚拟端口和执行动 作参数;
如果所述匹配的表项存在,所述第一虚拟机根据所述匹配的表项中与所述 目标节点的地址所对应的执行动作参数处理所述待交换数据;
如果所述匹配的表项不存在,所述第一虚拟机建立能够与所述待交换数据 匹配的新表项, 并在所述 Openflow流表中插入所述新表项。
9. 一种宿主机, 其特征在于, 包括:
创建模块, 用于在输入 /输出 I/O设备的 I/O虚拟功能启动后, 在宿主机 Host之上产生至少一个虚拟机 VM,其中所述至少一个 VM包括具有虚拟交换 功能的第一虚拟机, 所述至少一个 VM还包括第二虚拟机; 配置模块, 用于向所述第一虚拟机发送配置命令, 以便于所述第一虚拟机 根据所述配置命令配置用于与所述第二虚拟机进行通信的所述第一虚拟机的 第一虚拟端口,并配置用于与所述 I/O设备进行通信的所述第一虚拟机的第二 虚拟端口。
10. 一种虚拟机, 其特征在于, 运行在宿主机 Host之上, 所述 Host运行 在硬件层之上,所述硬件层包括输入 /输出 I/O设备和存储设备,所述虚拟机包 括:
接收模块, 用于接收源节点发送的第一消息, 所述第一消息用于请求所述 虚拟机对待交换数据进行交换处理,其中所述待交换数据是从所述源节点发往 目标节点的, 所述源节点和所述目标节点中的至少一个为第二虚拟机, 所述第 二虚拟机运行在所述 Host之上;
交换处理模块,用于根据所述待交换数据携带的目标节点的地址和配置的 端口映射表确定第二消息,所述第二消息用于指示所述目标节点从所述硬件层 的存储设备获取所述待交换数据;
发送模块, 用于向所述目标节点发送所述第二消息。
11. 根据权利要求 10所述的虚拟机, 其特征在于, 包括:
代理 Agent模块, 用于根据所述 Host发送的配置命令, 配置用于与所述 第二虚拟机进行通信的所述虚拟机的第一虚拟端口, 并配置用于与所述 I/O设 备进行通信的所述虚拟机的第二虚拟端口;
生成模块,用于建立所述第一虚拟端口与所述第二虚拟端口之间的映射关 系, 以生成所述端口映射表。
12. 根据权利要求 11所述的虚拟机, 其特征在于, 所述 Agent模块, 还 用于根据所述配置命令配置所述第二虚拟机对应的第一共享内存,其中所述第 一共享内存为所述硬件层的存储设备上的指定存储区域。
13. 根据权利要求 12所述的虚拟机, 其特征在于,
所述接收模块, 具体用于通过所述第一虚拟端口接收所述第一消息, 所述 第一消息包括用于向所述虚拟机指示所述源节点已完成将所述待交换数据写 入所述第一共享内存的写完中断;
所述交换处理模块,具体用于根据用于接收所述第一消息的所述第一虚拟 端口确定对应的所述第一共享内存的地址;从所述第一共享内存获取所述待交 换数据,根据所述待交换数据携带的所述目标节点的地址从所述端口映射表中 确定与所述目标节点对应的所述第二虚拟端口;确定携带有所述第一共享内存 的地址和读取指令的所述第二消息;
所述发送模块,具体用于通过所述第二虚拟端口向所述目标节点发送所述 第二消息;
其中, 所述源节点为所述第二虚拟机, 所述目标节点为所述 I/O设备。
14. 根据权利要求 12所述的虚拟机, 其特征在于,
所述接收模块, 具体用于接收源节点发送的所述第一消息;
所述交换处理模块, 具体用于获取所述待交换数据携带的目标节点的地址; 根据所述目标节点的地址查询所述端口映射表以确定与所述目标节点对应的第一虚拟端口并确定与所述目标节点对应的第一共享内存的地址;
所述发送模块,具体用于通过所述源节点所对应的所述第二虚拟端口向所 述源节点发送携带有所述第一共享内存的地址的回复消息;
所述交换处理模块,还用于在接收到所述源节点发送的用于向所述虚拟机 指示所述源节点已完成将所述待交换数据写入所述第一共享内存的写完中断 时, 确定携带有读取指令的所述第二消息;
所述发送模块,还用于通过所述第一虚拟端口向所述目标节点发送所述第 二消息;
所述接收模块,还用于接收所述源节点发送的指示所述源节点已完成将所 述待交换数据写入所述第一共享内存的写完中断;
其中, 所述源节点为所述 I/O设备, 所述目标节点为所述第二虚拟机。
15. 根据权利要求 11所述的虚拟机, 其特征在于,
所述接收模块,具体用于通过所述第一虚拟端口接收所述源节点发送的所 述第一消息, 所述第一消息包括写完中断;
所述交换处理模块, 具体用于根据用于接收所述第一消息的所述第一虚拟端口确定所述第一虚拟端口对应的所述源节点的地址; 根据所述源节点的地址和所述待交换数据携带的目标节点的地址确定所述第二共享内存的地址; 确定携带有所述第二共享内存的地址和读取指令的所述第二消息;
所述发送模块, 具体用于向所述目标节点发送所述第二消息;
其中,所述至少一个 VM还包括第三虚拟机,所述源节点为所述第二虚拟 机, 所述目标节点为所述第三虚拟机。
16. 一种计算节点, 其特征在于, 包括: 硬件层、 运行在所述硬件层之上 的宿主机 Host、 以及运行在所述 Host之上的至少一个虚拟机 VM, 其中, 所 述硬件层包括输入 /输出 I/O设备和存储设备, 所述至少一个虚拟机 VM包括 具有虚拟交换功能的第一虚拟机,所述至少一个 VM还包括第二虚拟机,其中: 所述第一虚拟机, 用于接收源节点发送的第一消息, 所述第一消息用于请 求所述第一虚拟机对待交换数据进行交换处理,其中所述待交换数据是从所述 源节点发往目标节点的,所述源节点和所述目标节点中的至少一个为所述第二 虚拟机;
所述第一虚拟机,还用于根据所述待交换数据携带的目标节点的地址和配 置的端口映射表确定第二消息并发送所述第二消息,所述第二消息用于指示所 述目标节点从所述硬件层的存储设备获取所述待交换数据。
17. 根据权利要求 16所述的计算节点, 其特征在于,
所述 Host, 用于向所述第一虚拟机发送配置命令;
所述第一虚拟机,还用于根据所述配置命令配置用于与所述第二虚拟机进 行通信的所述第一虚拟机的第一虚拟端口,并配置用于与所述 I/O设备进行通 信的所述第一虚拟机的第二虚拟端口;
所述第一虚拟机,还用于建立所述第一虚拟端口与所述第二虚拟端口之间 的映射关系, 以生成所述端口映射表。
18. 根据权利要求 17所述的计算节点, 其特征在于,
所述第一虚拟机,还用于根据所述配置命令配置所述第二虚拟机对应的第 一共享内存,其中所述第一共享内存为所述硬件层的存储设备上的指定存储区 域。
19. 根据权利要求 18所述的计算节点, 其特征在于,
所述源节点, 用于将所述待交换数据写入所述第一共享内存;
所述源节点, 还用于向所述第一虚拟机发送所述第一消息;
所述第一虚拟机, 具体用于通过所述第一虚拟端口接收所述第一消息, 所 述第一消息包括用于向所述第一虚拟机指示所述源节点已完成将所述待交换 数据写入所述第一共享内存的写完中断;以及根据用于接收所述第一消息的所 述第一虚拟端口确定对应的所述第一共享内存的地址;从所述第一共享内存获 取所述待交换数据,根据所述待交换数据携带的所述 I/O设备的地址从所述端 口映射表中确定与所述 I/O设备对应的所述第二虚拟端口;确定携带有所述第 一共享内存的地址和读取指令的所述第二消息,并通过所述第二虚拟端口向所 述目标节点发送所述第二消息;
所述目标节点,用于根据所述第二消息从所述第一共享内存读取所述待交 换数据;
其中, 所述源节点为所述第二虚拟机, 所述目标节点为所述 I/O设备。
20. 根据权利要求 18所述的计算节点, 其特征在于,
所述第一虚拟机, 具体用于接收源节点发送的所述第一消息, 获取所述待 交换数据携带的目标节点的地址;根据所述目标节点的地址查询所述端口映射 表以确定与所述目标节点对应的第一虚拟端口并确定与所述目标节点对应的 第一共享内存的地址;通过所述源节点所对应的所述第二虚拟端口向所述源节 点发送携带有所述第一共享内存的地址的回复消息; 以及, 在接收到所述源节 点发送的用于向所述第一虚拟机指示所述源节点已完成将所述待交换数据写 入所述第一共享内存的写完中断时,确定携带有读取指令的所述第二消息,通 过所述第一虚拟端口向所述目标节点发送所述第二消息;
所述源节点 ,还用于根据所述回复消息中的所述第一共享内存的地址将所 述待交换数据写入所述第一共享内存;
所述源节点,还用于向所述第一虚拟机发送指示所述源节点已完成将所述 待交换数据写入所述第一共享内存的写完中断;
所述目标节点,用于根据所述第二消息从所述第一共享内存读取所述待交 换数据;
其中, 所述源节点为所述 I/O设备, 所述目标节点为所述第二虚拟机。
21. 根据权利要求 17所述的计算节点, 其特征在于,
所述源节点,还用于将所述待交换数据写入所述源节点与所述目标节点通 过所述第一虚拟机预先协商的第二共享内存,其中所述第二共享内存为所述硬 件层的存储设备上的指定存储区域;
所述源节点,还用于通过所述第一虚拟端口向所述第一虚拟机发送所述第 一消息, 所述第一消息包括写完中断;
所述第一虚拟机, 具体用于根据用于接收所述第一消息的所述第一虚拟端口确定所述第一虚拟端口对应的所述源节点的地址; 根据所述源节点的地址和所述待交换数据携带的目标节点的地址确定所述第二共享内存的地址; 确定携带有所述第二共享内存的地址和读取指令的所述第二消息, 并向所述目标节点发送所述第二消息;
所述目标节点,用于根据所述第二消息从所述第二共享内存读取所述待交 换数据;
其中,所述至少一个 VM还包括第三虚拟机,所述源节点为所述第二虚拟 机, 所述目标节点为所述第三虚拟机。
22. 根据权利要求 19至 21中任意一项所述的计算节点, 其特征在于, 所 述目标节点根据所述第二消息从所述共享内存读取所述待交换数据之后, 所述目标节点,还用于向所述第一虚拟机发送读完指示信息, 以便于所述 第一共享内存或所述第二共享内存被释放;
所述第一虚拟机, 还用于释放所述第一共享内存或所述第二共享内存。
23. 根据权利要求 16至 22中任意一项所述的计算节点, 其特征在于, 在 所述端口映射表为开放流 Openflow流表时, 在接收源节点发送的第一消息之 后,
所述第一虚拟机,还用于根据所述待交换数据携带的目标节点的地址, 在 所述 Openflow流表中确定与所述目标节点的地址所匹配的表项, 其中, 所述 Openflow 流表中包括至少一个表项, 所述表项包括地址、 虚拟端口和执行动 作参数;
如果所述匹配的表项存在,根据所述匹配的表项中与所述目标节点的地址 所对应的执行动作参数处理所述待交换数据;
如果所述匹配的表项不存在, 建立能够与所述待交换数据匹配的新表项, 并在所述 Openflow流表中插入所述新表项。
24. 一种计算机系统, 其特征在于, 包括: 至少一个如权利要求 16至 23 任意一项所述的计算节点。
PCT/CN2014/072502 2013-06-28 2014-02-25 虚拟交换方法、相关装置和计算机系统 WO2014206105A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP14818411.2A EP2996294A4 (en) 2013-06-28 2014-02-25 VIRTUAL SWITCHING, RELEVANT DEVICE AND COMPUTER SYSTEM
US14/486,246 US9996371B2 (en) 2013-06-28 2014-09-15 Virtual switching method, related apparatus, and computer system
US15/979,486 US10649798B2 (en) 2013-06-28 2018-05-15 Virtual switching method, related apparatus, and computer system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310270272.9A CN103346981B (zh) 2013-06-28 2013-06-28 虚拟交换方法、相关装置和计算机系统
CN201310270272.9 2013-06-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/486,246 Continuation US9996371B2 (en) 2013-06-28 2014-09-15 Virtual switching method, related apparatus, and computer system

Publications (1)

Publication Number Publication Date
WO2014206105A1 true WO2014206105A1 (zh) 2014-12-31

Family

ID=49281756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/072502 WO2014206105A1 (zh) 2013-06-28 2014-02-25 虚拟交换方法、相关装置和计算机系统

Country Status (4)

Country Link
US (2) US9996371B2 (zh)
EP (1) EP2996294A4 (zh)
CN (1) CN103346981B (zh)
WO (1) WO2014206105A1 (zh)

Families Citing this family (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103346981B (zh) * 2013-06-28 2016-08-10 华为技术有限公司 虚拟交换方法、相关装置和计算机系统
CN103618809A (zh) * 2013-11-12 2014-03-05 华为技术有限公司 一种虚拟化环境下通信的方法、装置和系统
CN104660506B (zh) * 2013-11-22 2018-12-25 华为技术有限公司 一种数据包转发的方法、装置及系统
CN104661324B (zh) * 2013-11-22 2019-04-09 索尼公司 无线通信系统以及用在无线通信系统中的方法
US10120729B2 (en) 2014-02-14 2018-11-06 Vmware, Inc. Virtual machine load balancing
TWI531908B (zh) * 2014-04-24 2016-05-01 A method of supporting virtual machine migration with Software Defined Network (SDN)
US9515931B2 (en) * 2014-05-30 2016-12-06 International Business Machines Corporation Virtual network data control with network interface card
US9515933B2 (en) * 2014-05-30 2016-12-06 International Business Machines Corporation Virtual network data control with network interface card
WO2015196403A1 (zh) * 2014-06-26 2015-12-30 华为技术有限公司 软件定义网络的服务质量控制方法及设备
CN106664235B (zh) * 2014-08-19 2019-12-06 华为技术有限公司 软件定义网络与传统网络的融合方法以及装置
CN104219149B (zh) * 2014-08-26 2018-07-13 新华三技术有限公司 一种基于虚连接的报文传输方法和设备
US9489242B2 (en) * 2014-09-30 2016-11-08 Telefonaktiebolaget L M Ericsson (Publ) Algorithm for faster convergence through affinity override
US9594649B2 (en) 2014-10-13 2017-03-14 At&T Intellectual Property I, L.P. Network virtualization policy management system
US9445279B2 (en) * 2014-12-05 2016-09-13 Huawei Technologies Co., Ltd. Systems and methods for placing virtual serving gateways for mobility management
DE112015005728B4 (de) * 2014-12-22 2021-07-29 Servicenow, Inc. Automatisches Auffinden von Konfigurationselementen
CN104601468B (zh) * 2015-01-13 2018-10-09 新华三技术有限公司 报文转发方法和设备
WO2016188548A1 (en) * 2015-05-22 2016-12-01 Huawei Technologies Co., Ltd. Telecommunication network with automated control and data plane instantiation
CN107710662A (zh) * 2015-06-29 2018-02-16 华为技术有限公司 数据处理的方法及接收设备
CN106664242B (zh) 2015-07-03 2019-09-03 华为技术有限公司 一种网络的配置方法、网络系统和设备
US9992153B2 (en) 2015-07-15 2018-06-05 Nicira, Inc. Managing link aggregation traffic in edge nodes
US10243914B2 (en) * 2015-07-15 2019-03-26 Nicira, Inc. Managing link aggregation traffic in edge nodes
DE102015214424A1 (de) * 2015-07-29 2017-02-02 Robert Bosch Gmbh Verfahren und Vorrichtung zum Kommunizieren zwischen virtuellen Maschinen
CN109617816B (zh) * 2015-09-17 2020-08-14 杭州数梦工场科技有限公司 一种数据报文的传输方法和装置
US9729441B2 (en) 2015-10-09 2017-08-08 Futurewei Technologies, Inc. Service function bundling for service function chains
CN106612306A (zh) * 2015-10-22 2017-05-03 中兴通讯股份有限公司 虚拟机的数据共享方法及装置
WO2017127972A1 (zh) * 2016-01-25 2017-08-03 华为技术有限公司 一种数据传输方法以及宿主机
CN105573852B (zh) * 2016-02-03 2018-11-30 南京大学 一种虚拟地址隔离环境下超高速数据对象通信的方法
US20170279676A1 (en) * 2016-03-22 2017-09-28 Futurewei Technologies, Inc. Topology-based virtual switching model with pluggable flow management protocols
CN109074330B (zh) * 2016-08-03 2020-12-08 华为技术有限公司 网络接口卡、计算设备以及数据包处理方法
WO2018023498A1 (zh) 2016-08-03 2018-02-08 华为技术有限公司 网络接口卡、计算设备以及数据包处理方法
CN106302225B (zh) * 2016-10-18 2019-05-03 优刻得科技股份有限公司 一种服务器负载均衡的方法与装置
CN107278362B (zh) 2016-11-09 2019-04-05 华为技术有限公司 云计算系统中报文处理的方法、主机和系统
CN107278359B (zh) 2016-11-09 2020-09-18 华为技术有限公司 云计算系统中报文处理的方法、主机和系统
CN108111461B (zh) * 2016-11-24 2020-11-20 中移(苏州)软件技术有限公司 实现虚拟机访问管理网络的方法、装置、网关及系统
CN106603409B (zh) * 2016-11-30 2020-02-14 中国科学院计算技术研究所 一种数据处理系统、方法及设备
CN108243118B (zh) * 2016-12-27 2020-06-26 华为技术有限公司 转发报文的方法和物理主机
CN106874128B (zh) * 2017-01-22 2020-11-20 广州华多网络科技有限公司 数据传输方法及装置
JP2019029946A (ja) * 2017-08-03 2019-02-21 富士通株式会社 通信制御装置、通信制御システム、及び通信制御方法
CN107896195B (zh) * 2017-11-16 2020-04-24 锐捷网络股份有限公司 服务链编排方法、装置及服务链拓扑结构系统
CN110554977A (zh) * 2018-05-30 2019-12-10 阿里巴巴集团控股有限公司 数据缓存方法、数据处理方法、计算机设备、存储介质
CN110912825B (zh) 2018-09-18 2022-08-02 阿里巴巴集团控股有限公司 一种报文的转发方法、装置、设备及系统
CN111026324B (zh) * 2018-10-09 2021-11-19 华为技术有限公司 转发表项的更新方法及装置
CN109901909B (zh) * 2019-01-04 2020-12-29 中国科学院计算技术研究所 用于虚拟化系统的方法及虚拟化系统
CN111443985A (zh) * 2019-01-17 2020-07-24 华为技术有限公司 实例化虚拟网络功能的方法及设备
CN110048963B (zh) * 2019-04-19 2023-06-06 杭州朗和科技有限公司 虚拟网络中的报文传输方法、介质、装置和计算设备
CN111064671B (zh) * 2019-12-09 2022-05-06 南京中孚信息技术有限公司 数据包转发方法、装置及电子设备
CN111158905A (zh) * 2019-12-16 2020-05-15 华为技术有限公司 调整资源的方法和装置
US11681542B2 (en) * 2020-01-16 2023-06-20 Vmware, Inc. Integrating virtualization and host networking
US11736415B2 (en) * 2020-02-10 2023-08-22 Nokia Solutions And Networks Oy Backpressure from an external processing system transparently connected to a router
CN111556131B (zh) * 2020-04-24 2023-06-23 西安万像电子科技有限公司 求救信息处理方法、装置及系统
CN111988230B (zh) * 2020-08-19 2023-04-07 海光信息技术股份有限公司 虚拟机通信方法、装置、系统及电子设备
US11567794B1 (en) * 2020-09-30 2023-01-31 Virtuozzo International Gmbh Systems and methods for transparent entering of a process into a virtual machine
CN112565113A (zh) * 2020-12-23 2021-03-26 科东(广州)软件科技有限公司 多虚拟机间的网卡共享系统、方法、装置、设备及介质
CN113132155B (zh) * 2021-03-29 2022-02-22 新华三大数据技术有限公司 一种虚拟交换机分布式逃生方法、装置及存储介质
US20230032967A1 (en) * 2021-07-29 2023-02-02 Red Hat, Inc. Establishing process connections utilizing an intermediary broker
CN116954952B (zh) * 2023-09-18 2024-01-09 之江实验室 一种机器人的自适应混合通信方法、装置、介质及设备

Citations (5)

Publication number Priority date Publication date Assignee Title
US20090268608A1 (en) * 2005-11-30 2009-10-29 Nokia Siemens Networks Gmbh & Co. Kg Method and device for automatically configuring a virtual switching system
CN101630270A (zh) * 2009-07-22 2010-01-20 成都市华为赛门铁克科技有限公司 数据处理系统和方法
CN102132511A (zh) * 2008-08-27 2011-07-20 思科技术公司 用于虚拟机的虚拟交换机服务质量
CN103095546A (zh) * 2013-01-28 2013-05-08 华为技术有限公司 一种处理报文的方法、装置及数据中心网络
CN103346981A (zh) * 2013-06-28 2013-10-09 华为技术有限公司 虚拟交换方法、相关装置和计算机系统

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US7739684B2 (en) * 2003-11-25 2010-06-15 Intel Corporation Virtual direct memory access crossover
CN100399273C (zh) * 2005-08-19 2008-07-02 联想(北京)有限公司 一种虚拟机系统及其硬件配置方法
US20070220217A1 (en) * 2006-03-17 2007-09-20 Udaya Shankara Communication Between Virtual Machines
JP4756603B2 (ja) * 2006-10-10 2011-08-24 ルネサスエレクトロニクス株式会社 データプロセッサ
CN101819564B (zh) * 2009-02-26 2013-04-17 国际商业机器公司 协助在虚拟机之间进行通信的方法和装置
CN102648455B (zh) * 2009-12-04 2015-11-25 日本电气株式会社 服务器和流控制程序
CN102103518B (zh) 2011-02-23 2013-11-13 运软网络科技(上海)有限公司 一种在虚拟化环境中管理资源的系统及其实现方法
WO2012114398A1 (en) * 2011-02-24 2012-08-30 Nec Corporation Network system, controller, and flow control method

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20090268608A1 (en) * 2005-11-30 2009-10-29 Nokia Siemens Networks Gmbh & Co. Kg Method and device for automatically configuring a virtual switching system
CN102132511A (zh) * 2008-08-27 2011-07-20 思科技术公司 用于虚拟机的虚拟交换机服务质量
CN101630270A (zh) * 2009-07-22 2010-01-20 成都市华为赛门铁克科技有限公司 数据处理系统和方法
CN103095546A (zh) * 2013-01-28 2013-05-08 华为技术有限公司 一种处理报文的方法、装置及数据中心网络
CN103346981A (zh) * 2013-06-28 2013-10-09 华为技术有限公司 虚拟交换方法、相关装置和计算机系统

Non-Patent Citations (1)

Title
See also references of EP2996294A4 *

Also Published As

Publication number Publication date
US20180267816A1 (en) 2018-09-20
CN103346981A (zh) 2013-10-09
US10649798B2 (en) 2020-05-12
EP2996294A4 (en) 2016-06-08
CN103346981B (zh) 2016-08-10
US20150026681A1 (en) 2015-01-22
EP2996294A1 (en) 2016-03-16
US9996371B2 (en) 2018-06-12

Similar Documents

Publication Publication Date Title
US10649798B2 (en) Virtual switching method, related apparatus, and computer system
US11171830B2 (en) Multiple networks for virtual execution elements
US12010093B1 (en) Allocating addresses from pools
US11792126B2 (en) Configuring service load balancers with specified backend virtual networks
US10728145B2 (en) Multiple virtual network interface support for virtual execution elements
US12101253B2 (en) Container networking interface for multiple types of interfaces
US9031081B2 (en) Method and system for switching in a virtualized platform
US20220334864A1 (en) Plurality of smart network interface cards on a single compute node
US9176767B2 (en) Network interface card device pass-through with multiple nested hypervisors
WO2018023499A1 (zh) 网络接口卡、计算设备以及数据包处理方法
US20100287262A1 (en) Method and system for guaranteed end-to-end data flows in a local networking domain
US20140068045A1 (en) Network system and virtual node migration method
WO2016065643A1 (zh) 一种网卡配置方法及资源管理中心
US20230079209A1 (en) Containerized routing protocol process for virtual private networks
WO2014063463A1 (zh) 一种物理网卡管理方法、装置及物理主机
WO2013024377A1 (en) Virtual network overlays
EP4297359A1 (en) Metric groups for software-defined network architectures
Zhou Virtual networking
US12034652B2 (en) Virtual network routers for cloud native software-defined network architectures
CN118802776A (zh) 混合网络协议栈的数据传输方法、系统、设备及存储介质
CN117278428A (zh) 用于软件定义网络架构的度量组
WO2016091014A1 (zh) 基于边缘虚拟桥接的数据交换方法、系统及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14818411

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2014818411

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE