CN101430674A - Intraconnection communication method of distributed virtual machine monitoring apparatus - Google Patents

Info

Publication number: CN101430674A
Authority: CN (China)
Prior art keywords: vmm, data, operating system, node, ring
Legal status: Granted; Expired - Fee Related
Application number: CNA2008102398997A
Other languages: Chinese (zh)
Other versions: CN101430674B (en)
Inventors: 宋忠雷, 肖利民, 陈思名, 彭近兵, 祝明发, 马博
Current Assignee: Huawei Technologies Co Ltd
Original Assignee: Beihang University
Application filed by: Beihang University
Priority: CN2008102398997A
Publications: CN101430674A (application), CN101430674B (grant)

Abstract

The invention provides a method for implementing communication in a distributed virtual machine monitor. The method uses an existing reliable transport protocol to achieve reliable and efficient communication among virtual machine monitors (VMMs). This provides the basis needed for resource integration and for presenting a single physical node to the guest operating system above, so that the guest operating system can manage and use the resources of multiple nodes. The method works mainly by combining an extended device emulation component with a descriptor ring mechanism. It comprises four steps: step one, a preparation phase; step two, a connection establishment phase; step three, a data transmission process; and step four, a connection closing phase. The method innovates on existing mature technology, is simple to implement, and has good prospects for use and development.

Description

An intraconnection communication method for a distributed virtual machine monitor
(1) Technical field
The present invention relates to an intraconnection communication method for a distributed virtual machine monitor. It belongs to computer system virtualization technology, and in particular concerns the internal communication technology of a distributed virtual machine monitor system whose goal is to virtualize a server cluster. It belongs to the field of computer technology.
(2) Background
Virtualization technology
Single-machine virtualization is gradually maturing
Virtualization technology has attracted wide attention since it first appeared, and more and more vendors are now getting involved: AMD and Intel at the processor level, Microsoft at the operating system level, a large number of third-party software vendors, and server system manufacturers, who have responded enthusiastically. Virtualization technology is clearly developing rapidly.
Xen is an open-source virtual machine monitor (VMM) developed by the University of Cambridge Computer Laboratory. Multiple virtual machines can be created on it at the same time, each running one operating system. The virtualization technique Xen adopts is called paravirtualization: the VMM leaves the ordinary instructions of the guest operating system alone, while the sensitive instructions of the operating system must be replaced with hypercalls. This greatly improves the isolation and performance of the system, but it has one drawback: the user must modify the operating system. Since AMD and Intel released hardware-assisted virtualization support, Xen has also supported full virtualization with hardware assistance.
VMware is the market leader and adopts full virtualization (Full-Virtualization). With full virtualization the guest operating system runs directly on the virtual machine without any change, so users can run a conventional, unmodified operating system as a guest.
Multi-machine virtualization takes the stage
As demand for computing resources grows, the resources of a single machine no longer satisfy users, and breaking through the single-host limit has become another important direction for virtualization technology.
The Virtual Multiprocessor project of the University of Tokyo is a distributed VMM based on an IA-32 cluster. In this project a simplified operating system runs between the VMM and the hardware layer, and this host OS layer provides support to the VMM. Virtual Multiprocessor adopts paravirtualization: the guest operating system is modified so that it cooperates with Virtual Multiprocessor. Virtual Multiprocessor runs in user mode and relies on system calls to the host operating system for its main functions, so it is inefficient. Communication between its distributed VMMs is carried out by calling the TCP protocol of the underlying host operating system, which is responsible for inter-node communication.
vNUMA is a distributed VMM based on an IA-64 cluster, developed at the University of New South Wales; it runs directly at the lowest layer. The guest operating system (Linux) cooperates with vNUMA through paravirtualization. The main goal of vNUMA is to provide transparent distributed shared memory for scientific computing. vNUMA adopts a technique called pre-virtualization, proposed jointly by the University of Karlsruhe in Germany, the University of New South Wales and IBM. It is a semi-automatic, tool-supported way of building the guest: with assembler support, the guest system's code is scanned and some of its privileged instructions are statically replaced, while instructions that cannot be handled statically are found dynamically by profiling and replaced manually. The guiding idea is to use compilation tool support to maximize the instructions that can run directly and minimize those that need emulation.
Cross-node communication systems
Communication between VMMs
Virtual Multiprocessor communication
The Virtual Multiprocessor project of the University of Tokyo also implements a distributed virtual machine. In this project a simplified operating system, called the Host OS, runs between the VMM and the hardware layer. Communication between VMMs uses the project's own application protocol, which sends and receives data through the Ethernet-based TCP/IP protocol stack in the Host OS. Its advantage is that the communication protocol reuses the existing TCP/IP stack, which makes implementation much easier. Its shortcomings are equally obvious: it needs the support of the host OS, and because communication takes place at the application layer, efficiency is low.
vNUMA communication
vNUMA is a distributed VMM based on an IA-64 cluster, developed at the University of New South Wales. It is a VMM for the Itanium architecture that runs directly on the hardware, and its low-level communication is implemented directly on the hardware. It is not open source, however. Its advantage is the high efficiency of a hardware-level implementation; its shortcoming is the need for specialized hardware support, which makes implementation difficult.
Other efficient cross-node communication mechanisms
The VMMC communication mechanism
Virtual Memory-Mapped Communication (VMMC) is a communication mechanism based on virtual memory mapping. It supports direct data transfer from the sender's virtual memory to the receiver's virtual memory and supports zero-copy data transfer, but it needs dedicated hardware support.
Active Messages
Active Messages was the first widely used ULN (User Level Networking) mechanism. It was originally developed for parallel microcomputers, was used on different network interface hardware, and later evolved into a kind of assembly language for communication. It needs no protocol support and sits closer to the hardware, so it is efficient, but its handling of deadlock and synchronization is relatively poor.
Fast Sockets
Fast Sockets uses the GAM (Globally Addressable Memory) interface. It does not expose the underlying queue structure and rules out implicit execution of handlers; instead, user processes manage buffers for incoming messages and manage flow control to avoid deadlock. It is widely used for cluster interconnects and has driven the development of related protocols that address problems such as node contention caused by bulk transfers. Although it needs only a small amount of data copying, it requires a dedicated platform interface.
In summary, there are currently two main kinds of communication between VMMs. One is for VMMs based on a host OS, where communication is handled by the protocol stack in the host OS. The other is for VMMs that run on hardware, where communication is built on proprietary communication hardware. The communication mechanism set out in the present invention is based on gigabit Ethernet and provides communication for VMMs that run directly on the hardware.
(3) Summary of the invention
The object of the present invention is to provide an intraconnection communication method for a distributed virtual machine monitor. It mainly combines a ring mechanism with a reliable transport protocol, assisted by an effective notification mechanism, and uses a high-speed local area network to give the distributed virtual machine monitor reliable and efficient communication and thereby complete the integration of cluster resources.
The method of the present invention is based on a cluster system, that is, a multicomputer system connected by an external interconnection network. Its physical resources are distributed over multiple nodes; server resources are integrated by virtualizing each node's resources and having the modules cooperate, and the computers in the cluster cooperate by passing messages over the network. The goal of the invention is to provide reliable and efficient communication for a virtual machine monitor that uses virtualization technology to offer symmetric multiprocessor (Symmetric Multi-Processors, SMP) characteristics on a cluster system. A distinguishing feature of the distributed virtual machine monitor is that it runs directly on the physical hardware, without the help of an operating system.
The invention provides a virtual machine monitor with SMP characteristics on the physical structure of the cluster by deploying a VMM on each cluster node. A modified Linux operating system, called the device domain (Dom0), runs on the VMM and provides the reliable transport protocol stack. An efficient interaction mechanism between the VMM and the device domain lets the VMM call the device domain's protocol stack, which gives reliable and efficient data transmission between VMMs, so that commercial operating systems that support the SMP structure can run in this virtual machine without modification.
The communication module of the VMM provides reliable and efficient data transmission for the VMMs deployed on the physical nodes, so that the VMMs can integrate the multiprocessor resources and present a single system image to the guest operating system.
The intraconnection communication method for a distributed virtual machine monitor of the present invention is implemented in the following steps:
Step one, preparation phase:
When creating a virtual machine and allocating memory for it, the VMM on each node reserves space for the dual rings, marks those pages as unavailable to the virtual machine, and initializes the request ring and service ring structures;
When starting the virtual machine, the VMM on each node starts the device emulation module dm and maps the pages holding the dual rings into the domain0 address space, for use by domain0 and by the dm module, an application program running on domain0;
The VMM on each node initializes the event channel; the event channel is managed by the VMM and exposed to domain0 for use;
Step two, connection establishment phase:
When the guest operating system starts:
The management module starts the guest operating system, which first traps into the VMM; after allocating core resources to the guest operating system, the VMM starts the qemu-dm module that provides device emulation for the guest operating system;
The communication module that we implemented is started in the qemu-dm module:
To improve the data transmission rate, we use three connections, one each for communication between vcpus on different physical nodes, remote I/O device access, and remote memory access;
The qemu-dm module runs at the application layer of domain0. Each node reads parameters, such as node IP addresses, from the guest operating system's configuration file and passes them in. The start node creates a socket and begins listening for connections, while each non-start node creates a socket and then initiates a connection request; this process is blocking, that is, execution continues only after the connections have been established;
A system call enters the kernel mode of domain0, where the TCP/IP protocol stack completes connection establishment; the descriptor of each established connection is saved in the connection array for use when sending data;
Step three, data transmission process:
When a page fault or an I/O device access occurs in the guest operating system, it first traps into the VMM;
The VMM analyses the reason for the trap and creates a request of the corresponding type, maps the address of the data to be sent into domain0, adds the request to the request ring through the write pointer, and finally sets the event channel;
The communication thread in the qemu-dm module is woken by the VMM through the event channel and begins polling the request ring; if data needs to be sent, it takes the request out through the read pointer, builds a packet according to the request, and adds the application-layer header to the packet;
The data is passed to the TCP/IP protocol stack in domain0 and sent out through the network card;
When the destination node receives the data, the network card interrupt hands it to the TCP/IP protocol stack; the protocol stack strips the headers of the transport layer and below and stores the data, the address is saved in a ring descriptor, the descriptor is placed into the service ring through the service ring's write pointer, and the VMM is then notified through the event channel to handle it;
The VMM reads the service ring, obtains the address from the descriptor, and maps the data into the guest operating system, which completes reception;
Step four, connection closing phase:
The guest operating system shuts down;
The start node's resources are released, and remote resources are released through the communication module;
The rings are checked to confirm that all data has been processed;
The connections are closed;
The rings and the communication-related resources are released;
The communication thread service exits.
The advantages and effects of the intraconnection communication method of the present invention are as follows: by using an existing reliable transport protocol stack, and by combining the dual rings with mechanisms such as the event channel to carry out communication in the distributed VMM, the invention improves the efficiency and scalability of communication in the distributed VMM system. Because the invention innovates on existing mature technology, it is easier to implement and has good prospects for use and development.
(4) Description of drawings
Fig. 1: Overall structure of the DVMM system
Fig. 2: System modules of two nodes
Fig. 3: Overall communication architecture
Fig. 4: Communication model of two nodes
Fig. 5: Descriptor ring structure
Fig. 6: Detailed communication process
(5) Embodiment
Referring to Figs. 1 to 5, the intraconnection communication method for a distributed virtual machine monitor of the present invention is described as follows.
1. Method overview
The invention is based on a cluster system, that is, a multicomputer system connected by an external interconnection network. Its physical resources are distributed over multiple nodes; server resources are integrated by virtualizing each node's resources and having the modules cooperate, and the computers in the cluster cooperate by passing messages over the network. The goal of the invention is to provide reliable and efficient communication for a virtual machine monitor that uses virtualization technology to offer symmetric multiprocessor (Symmetric Multi-Processors, SMP) characteristics on a cluster system. A distinguishing feature of the distributed virtual machine monitor is that it runs directly on the physical hardware, without the help of an operating system.
The invention provides a virtual machine monitor with SMP characteristics on the physical structure of the cluster by deploying a VMM on each cluster node. A modified Linux operating system, called the device domain (Dom0), runs on the VMM and provides the reliable transport protocol stack. An efficient interaction mechanism between the VMM and the device domain lets the VMM call the device domain's protocol stack, which gives reliable and efficient data transmission between VMMs, so that commercial operating systems that support the SMP structure can run in this virtual machine without modification.
The communication module of the VMM provides reliable and efficient data transmission for the VMMs deployed on the physical nodes, so that the VMMs can integrate the multiprocessor resources and present a single system image to the guest operating system.
2. Characteristics of distributed VMM communication
The distributed VMM system runs directly on the physical hardware; it manages and integrates hardware resources and serves the operating system running above it. Communication is one part of the distributed VMM system and serves the other functions of the whole system.
The VMM itself has no reliable transport layer protocol and is not responsible for managing external devices (the Ethernet network card), whereas the device domain has both. To achieve reliable communication between VMMs over Ethernet, the VMM must therefore implement an efficient interaction mechanism with the device domain so that it can use the protocol stack in the device domain. This makes full use of the device domain, avoids implementing a complicated transport layer protocol and a network card driver in the VMM, and makes the VMM more extensible.
3. System architecture
VMM communication is divided into the following modules, in order of the functional flow:
Module one: preprocessing of VMM communication.
When the Guest OS causes a remote request, it traps into the VMM. The VMM then analyses the reason for the Guest OS trap, distinguishes the different kinds of communication, and adds a different header to each, forming the application-layer header. In the DVMM system the main kinds of communication are:
IPI: each transfer carries the contents of one register.
Remote device access: each transfer carries an IOREQ and control information, no more than 100 bytes.
DSM: each transfer carries one page of data.
Remote I/O operation: each transfer carries one instruction.
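The four kinds of communication and their application-layer header can be sketched as follows. This is a minimal illustration in Python, not the implementation of the invention: the numeric codes in `MsgKind`, the header layout in `HEADER` (kind, destination node, payload length), the 4096-byte page size and the helper names are all assumptions. The text above only states that each kind of communication receives its own header.

```python
import struct
from enum import IntEnum


class MsgKind(IntEnum):
    """Hypothetical codes for the four kinds of DVMM communication."""
    IPI = 1        # contents of one register
    IOREQ = 2      # remote device access, at most 100 bytes
    DSM = 3        # one page of shared memory
    REMOTE_IO = 4  # one instruction


# Assumed header layout: kind (1 byte), destination node (2 bytes),
# payload length (4 bytes), in network byte order without padding.
HEADER = struct.Struct("!BHI")
PAGE_SIZE = 4096  # assumed page size


def pack_message(kind: MsgKind, dst_node: int, payload: bytes) -> bytes:
    """Prepend the application-layer header to the payload."""
    if kind is MsgKind.IOREQ and len(payload) > 100:
        raise ValueError("IOREQ messages carry at most 100 bytes")
    if kind is MsgKind.DSM and len(payload) != PAGE_SIZE:
        raise ValueError("DSM messages carry exactly one page")
    return HEADER.pack(kind, dst_node, len(payload)) + payload


def unpack_message(packet: bytes):
    """Split a packet into (kind, destination node, payload)."""
    kind, dst_node, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return MsgKind(kind), dst_node, payload
```

Carrying the payload length in the header lets the receiver split a TCP byte stream back into messages, which is one reason an application-layer header is needed on top of TCP at all.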
Module two: interaction between the VMM and the communication thread in the DM module on dom0.
The two ends of this communication are the VMM, which is in kernel mode, that is, on the hardware, and the qemu-dm module, which runs on domain0 in user mode; the two must exchange data. To let the VMM and Dom0 exchange data, a dual-ring mechanism is introduced, consisting of a request ring and a service ring. Because the data sent or received is usually large, the VMM and Dom0 do not exchange the data itself but pass it by reference. The rings hold control structures for the data being sent and are called descriptor rings. The address of the data that the VMM wants to send, or of the data that Dom0 has received, is stored in an entry of the ring, and the page holding the data is granted to Dom0 through the grant table in the VMM so that Dom0 can map it to complete the transfer. This both reduces data copying, which improves efficiency, and increases the capacity of the ring.
The grant table exists to share data between different domains. In the DVMM, the guest operating system and Dom0 can be regarded as two peer domains, and the VMM should guarantee the independence and security of both. Communication, however, requires Dom0 to send data that resides in the GOS, or to receive data and hand it to the GOS; to reduce data copying, data must be shared between the two.
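Passing data by reference rather than by value can be illustrated with a short Python sketch. It only models the idea: `guest_memory`, `Descriptor` and `map_granted` are hypothetical names, a `bytearray` stands in for guest pages, and a `memoryview` stands in for the mapping that Dom0 obtains after the page has been granted to it. No real Xen interface is used.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size

# A bytearray stands in for the guest's memory (4 pages).
guest_memory = bytearray(4 * PAGE_SIZE)


@dataclass(frozen=True)
class Descriptor:
    """Ring entry: refers to the data instead of containing it."""
    page: int     # index of the granted page
    offset: int   # start of the data inside the page
    length: int   # number of bytes


def map_granted(desc: Descriptor) -> memoryview:
    """Return a zero-copy view of the data a descriptor refers to."""
    if desc.offset + desc.length > PAGE_SIZE:
        raise ValueError("descriptor crosses the page boundary")
    start = desc.page * PAGE_SIZE + desc.offset
    return memoryview(guest_memory)[start:start + desc.length]


# The "VMM" side writes data into page 2 and publishes a descriptor.
guest_memory[2 * PAGE_SIZE:2 * PAGE_SIZE + 5] = b"hello"
desc = Descriptor(page=2, offset=0, length=5)

# The "Dom0" side maps the page and reads the data without copying it.
view = map_granted(desc)
```

Because a descriptor is a few integers regardless of how much data it refers to, the same ring can describe far more data than it could hold directly.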
Clearly, the request ring and the service ring must be accessible to both the VMM and Dom0 and, most importantly, must be located somewhere safe where no other module can modify them. For these reasons the request (send) ring and the service (receive) ring are reserved from the kernel pages of the Guest OS, and the Guest OS is prevented from using those pages. They are then made available to Dom0 by mapping. The pages of the Guest OS are managed by the VMM, so the VMM can use them as well.
As shown in Fig. 5, the request ring and the service ring have the same structure. Each ring has two pointers, read and write, and is shared by the VMM and dom0, with one end reading and the other writing. When the VMM requests communication, it puts the request into the request ring through the write pointer, and dom0 at the other end takes the request out through the read pointer and sends the data. For the service ring, after Dom0 receives data it creates a new ring entry and puts it into the service ring through the write pointer, and the VMM then reads the entry through the read pointer and handles it.
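A ring with separate read and write pointers can be sketched in Python as below. The class name `DescriptorRing`, the slot count, and the use of free-running counters (taken modulo the ring size when indexing) are assumptions made for illustration; the text specifies only that each ring has a read pointer and a write pointer used by opposite ends.

```python
class DescriptorRing:
    """Fixed-size ring with one writer and one reader."""

    def __init__(self, slots: int = 8):
        self.slots = [None] * slots
        self.read = 0    # advanced only by the consumer
        self.write = 0   # advanced only by the producer

    def is_empty(self) -> bool:
        # Equal pointers mean there is nothing to read.
        return self.read == self.write

    def is_full(self) -> bool:
        return self.write - self.read == len(self.slots)

    def put(self, descriptor) -> bool:
        """Producer side: store a descriptor, or report that the ring is full."""
        if self.is_full():
            return False
        self.slots[self.write % len(self.slots)] = descriptor
        self.write += 1
        return True

    def get(self):
        """Consumer side: take the oldest descriptor, or None if empty."""
        if self.is_empty():
            return None
        descriptor = self.slots[self.read % len(self.slots)]
        self.read += 1
        return descriptor


request_ring = DescriptorRing()   # VMM writes, dom0 reads
service_ring = DescriptorRing()   # dom0 writes, VMM reads
```

Since each pointer is modified by only one side, the two ends do not need to lock the ring against each other in this single-producer, single-consumer arrangement.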
The event channel mechanism coordinates the VMM and Dom0. In the DVMM system, communication is asynchronous to improve efficiency: the VMM notifies the dom0 kernel server through the event channel mechanism that a request needs handling. In this way the VMM keeps communication efficient without hurting its own performance.
An event channel is in essence a set of mask bits, each representing one channel, shared by the VMM and Dom0. When a Dom0 event needs to be triggered, for example when the VMM initiates a communication request, the VMM sets the event channel bit once the data is ready, and Dom0 learns by querying it that a data request needs handling. Dom0 can also modify the event channel, but because it is maintained by the VMM, this must be done through a hypercall.
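The bit-per-channel idea can be sketched in a few lines of Python. The class `EventChannels`, the channel count and the two channel numbers are hypothetical; in particular, `clear` merely stands in for the hypercall that Dom0 would have to make, and no real hypervisor interface is modelled.

```python
class EventChannels:
    """A word of pending bits, one bit per channel."""

    def __init__(self, channels: int = 32):
        self.channels = channels
        self.pending = 0

    def _check(self, channel: int) -> None:
        if not 0 <= channel < self.channels:
            raise ValueError(f"no such channel: {channel}")

    def notify(self, channel: int) -> None:
        """Set the channel's bit, as the VMM does once the data is ready."""
        self._check(channel)
        self.pending |= 1 << channel

    def is_pending(self, channel: int) -> bool:
        """Query the channel's bit, as Dom0 does to learn of a request."""
        self._check(channel)
        return bool((self.pending >> channel) & 1)

    def clear(self, channel: int) -> None:
        """Reset the bit (stands in for a hypercall made by Dom0)."""
        self._check(channel)
        self.pending &= ~(1 << channel)


REQUEST_CHANNEL = 0   # assumed: VMM -> dom0, "request ring has work"
SERVICE_CHANNEL = 1   # assumed: dom0 -> VMM, "service ring has data"
```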
Module three: the communication thread server implemented in the qemu-dm module
In the DVMM system the qemu-dm module itself is responsible for I/O emulation; our extension to qemu-dm makes it responsible for reliable message transmission between nodes as well. The communication thread is mainly responsible for establishing connections, polling the request and service rings, and sending and receiving data.
When the Guest OS is created, qemu-dm establishes TCP links to the corresponding qemu-dm on each of the other nodes. The communication thread polls both the TCP sockets and the request ring. When the communication thread finds that a message is destined for qemu-dm, it passes the message to qemu-dm; when it finds that a message is destined for the VMM, it puts the message directly into the service ring and uses the event channel to notify the VMM that a message has arrived.
When the VMM needs to transmit information, it puts the destination and the message into the request ring and notifies the communication thread through the event channel. After taking the message out of the request ring, the communication thread selects the TCP link that corresponds to the destination node and sends the message to the destination node through the socket.
The communication thread does a simple interpretation of each message. When it finds that the VMM needs to send a page, it first maps the page into its own address space, which lets it then take the data directly from the page and send it to the corresponding socket. Reception is similar: when the communication thread finds that remote page data has arrived, it first maps the destination page into its own linear address space and reads the page data from the socket directly into the destination guest page.
Because each communication thread in a multi-node system may maintain several TCP connections, the communication thread is also responsible for forwarding or routing packets between the request ring and the TCP sockets. The VMM communicates by node number only; the IP address that corresponds to a node number is transparent to the VMM. The communication thread must therefore send each packet from the VMM over the TCP connection that corresponds to the IP address of its destination node.
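Routing by node number can be sketched as a lookup table from node numbers to established connections. The class `NodeRouter` and its methods are hypothetical names, and how the table is filled (for example from the guest configuration file) is left out; the point is only that the caller never sees an IP address.

```python
import socket


class NodeRouter:
    """Maps node numbers to established stream connections."""

    def __init__(self):
        self._conn_by_node = {}

    def register(self, node: int, conn: socket.socket) -> None:
        """Remember which connection leads to a given node."""
        self._conn_by_node[node] = conn

    def send(self, node: int, packet: bytes) -> None:
        """Send a packet over the connection that belongs to the node."""
        try:
            conn = self._conn_by_node[node]
        except KeyError:
            raise LookupError(f"no connection to node {node}") from None
        conn.sendall(packet)
```

In a quick local experiment, `socket.socketpair()` can stand in for a TCP link: register one end under a node number, call `send` with that number, and read the bytes from the other end.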
Module four: data transmission by the protocol stack in domain0
The protocol stack and network device driver provided by Dom0: because external devices are managed by dom0 rather than by the VMM, we can only reuse the network device driver and reliable transport protocol (TCP) that dom0 provides. TCP is connection-oriented and provides mechanisms such as retransmission itself, so it can provide reliable communication between VMMs.
Through the combination of these mechanisms, the communication system provides the DVMM system with data transmission that is simple and reliable yet efficient.
As shown in Fig. 6, when the Guest OS needs a remote page or remote I/O access, (1) it traps into the VMM; (2) the VMM handles the request according to the reason for the Guest OS trap and puts the request for the data to be transmitted into the request ring; (3) the application-layer process in the qemu-dm module, which polls this ring, finds that there is data to send; (4) it performs the application-layer processing and calls the protocol stack in dom0; (5) the network card sends the data over Ethernet. On the receiving side, (6) the network card raises an interrupt after receiving the data, which is then received by the protocol stack; (7) the data is handed to the application layer for processing; (8) it is put into the service ring; (9) the VMM is notified; and (10) the VMM serves the Guest OS. This completes the whole send and receive process.
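The send and receive path described above can be walked through in a single process with a small Python simulation. Everything here is an assumption made for illustration: `deque` objects stand in for the two rings, `socket.socketpair()` stands in for the TCP link between two nodes, the header layout in `HEADER` is invented, and the trap, grant mapping and event channel notifications are omitted.

```python
import socket
import struct
from collections import deque

HEADER = struct.Struct("!HI")  # assumed header: destination node, payload length


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def send_side(request_ring: deque, conn: socket.socket) -> None:
    """Steps 3 to 5: drain the request ring and send each request."""
    while request_ring:
        dst_node, payload = request_ring.popleft()
        conn.sendall(HEADER.pack(dst_node, len(payload)) + payload)


def receive_side(conn: socket.socket, service_ring: deque) -> None:
    """Steps 6 to 8: read one packet and place it in the service ring."""
    dst_node, length = HEADER.unpack(recv_exact(conn, HEADER.size))
    service_ring.append((dst_node, recv_exact(conn, length)))


# Step 2: the sending VMM queues a request addressed to node 1.
node0_conn, node1_conn = socket.socketpair()
request_ring, service_ring = deque(), deque()
request_ring.append((1, b"page data"))

send_side(request_ring, node0_conn)
receive_side(node1_conn, service_ring)

# Steps 9 and 10: the receiving VMM takes the data from the service ring.
delivered = service_ring.popleft()
node0_conn.close()
node1_conn.close()
```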
4. System workflow
Initialization phase:
Initialization of the communication module:
Because the communication protocol is a connection-oriented C/S model, the nodes are not symmetric during initialization, and the system distinguishes two classes of node in the initialization phase: it chooses one node as the start node, and the remaining nodes are non-start nodes. The start node creates a socket, listens on a port, and accepts connections, while each non-start node creates a socket and then connects to the server end; the connections are saved in an array.
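The asymmetric start-up can be sketched with ordinary TCP sockets in Python. The function names, the loopback address and the use of port 0 (which asks the operating system for any free port) are assumptions chosen so the sketch can run on one machine; a real deployment would read addresses from the guest configuration file, as the text describes.

```python
import socket


def start_node_listen(peers: int, host: str = "127.0.0.1") -> socket.socket:
    """Start node: create a socket and listen for the other nodes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))      # port 0: let the OS choose a free port
    server.listen(peers)
    return server


def non_start_node_connect(host: str, port: int) -> socket.socket:
    """Non-start node: create a socket and connect to the start node."""
    return socket.create_connection((host, port))


def accept_peers(server: socket.socket, peers: int) -> list:
    """Start node: block until every peer has connected, keep the connections."""
    connections = []
    for _ in range(peers):
        conn, _addr = server.accept()
        connections.append(conn)
    return connections
```

A typical sequence is to call `start_node_listen`, read the chosen port with `server.getsockname()[1]`, let each non-start node call `non_start_node_connect`, and then collect the connections with `accept_peers`. Both `accept` and `create_connection` block, which matches the blocking connection set-up described in the text.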
Initialization of the rings:
The two descriptor rings are implemented in the reserved pages of the guest operating system. When the guest operating system starts, the pages are cleared to zero and the read and write pointers are set; both point to the start of the ring, and whether the ring holds data is determined from the difference between the two pointers.
Establishment of the service:
The DVMM system extends the qemu-dm module with two communication threads. One is responsible for sending data: it polls the request ring and handles the requests. The other is responsible for receiving data: it is triggered by the protocol stack in the device domain domain0 and, after the protocol stack has received data, puts the data into the service ring for the VMM to handle. The threads are created while the qemu-dm module starts.
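The division of work between the sending and receiving sides can be sketched with Python threads. This is a simplified model under several assumptions: `threading.Event` stands in for the event channel, `deque` objects stand in for the rings, `send` is any callable supplied by the caller, and the receiving side is shown as a callback (`on_data_received`) rather than a full thread.

```python
import threading
from collections import deque


def sender_loop(request_ring: deque, event: threading.Event,
                stop: threading.Event, send) -> None:
    """Sending thread: wait for a notification, then drain the request ring."""
    while True:
        event.wait()            # sleep until notified
        event.clear()
        while request_ring:     # poll the ring until it is empty
            send(request_ring.popleft())
        if stop.is_set():       # set stop first, then event, to shut down
            return


def on_data_received(service_ring: deque, event: threading.Event,
                     data: bytes) -> None:
    """Receiving side: queue received data and notify the consumer."""
    service_ring.append(data)
    event.set()
```

Waiting on the event before polling means the sending thread consumes no CPU time while the request ring is idle, yet still drains every queued request once it has been woken.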
Normal operation phase:
The distributed VMM integrates the resources of multiple nodes, and the communication system provides communication between the modules of the VMMs on different nodes.
Connection establishment: the nodes connect to each other through sockets; the start node is the server end and the other nodes are the client ends.
Data sending: when the VMM has data to send, it places a request in the request ring and notifies the qemu-dm module, which handles the request and then sends the data through the TCP/IP protocol stack.
Data reception: after the TCP/IP protocol stack receives data, the data is put into the service ring and handled by the VMM.
Below in conjunction with accompanying drawing, it is as follows that concrete implementation step is described in detail in detail:
Step 1, preparatory stage:
Each node VMM is the dicyclo headspace when creating virtual machine and being the virtual machine storage allocation, informs that the virtual machine page is unavailable, and initialization requests ring and service ring structure;
Each node VMM is starting outfit analog module dm when starting virtual machine respectively, and with the page-map at dicyclo place in the domain0 address space, for domain0 be operated in the dm module on the domain0 application program and use;
In each node VMM initialization of event channel respectively, event channel is managed by VMM, and presents to domain0 and use.
Step 2, connection establishment phase:
When the guest operating system starts:
The guest operating system is started by the management module; control first traps into the VMM, and after allocating core resources to the guest operating system, the VMM starts the qemu-dm module that provides device emulation for the guest operating system;
The communication module implemented within the qemu-dm module is then started:
To improve the data transmission rate, three connections are used for communication: one for communication between vcpus on different physical nodes, one for remote I/O device access, and one for remote memory access;
The qemu-dm module runs in the application layer of domain0. Each node reads parameters, such as node IP addresses, from the configuration file of the guest operating system and passes them to the module. The start node creates a socket and begins listening for connections, while each non-start node creates a socket and then initiates a connection request. This process is blocking, that is, execution continues only after the connections have been established;
Through a system call, execution enters the kernel mode of domain0, where the TCP/IP protocol stack completes connection establishment; the descriptor of each established connection is saved in a connection array for use when sending data.
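The connection establishment described above can be sketched as follows, using ordinary blocking TCP sockets on the loopback interface. The channel names, the one-byte channel tag sent after connecting, and all function names are assumptions made for illustration; the patent only states that three connections are established and saved in a connection array.

```python
import socket
import threading

# One connection per traffic class named in the text (names are illustrative).
CHANNELS = ("vcpu", "remote_io", "remote_memory")


def listen_on_start_node(host="127.0.0.1", port=0):
    """Start node: create a socket and begin listening (port 0 = any free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(len(CHANNELS))
    return listener


def accept_channels(listener):
    """Block until all channels are connected; return the connection array."""
    connections = [None] * len(CHANNELS)
    for _ in CHANNELS:
        conn, _addr = listener.accept()  # blocks until a peer connects
        tag = conn.recv(1)               # assumed one-byte channel index
        if not tag:
            raise ConnectionError("peer closed before identifying its channel")
        connections[tag[0]] = conn
    return connections


def connect_channels(host, port):
    """Non-start node: open one blocking connection per channel."""
    connections = []
    for index, _name in enumerate(CHANNELS):
        conn = socket.create_connection((host, port))
        conn.sendall(bytes([index]))
        connections.append(conn)
    return connections


def establish_on_loopback():
    """Run both roles in one process, purely to demonstrate the handshake."""
    listener = listen_on_start_node()
    port = listener.getsockname()[1]
    accepted = {}
    worker = threading.Thread(
        target=lambda: accepted.update(conns=accept_channels(listener)))
    worker.start()
    outgoing = connect_channels("127.0.0.1", port)
    worker.join()
    listener.close()
    return accepted["conns"], outgoing
```

Indexing the connection array by channel lets a sender pick the connection that matches the request type without searching.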
Step 3, data transmission process:
When the guest operating system takes a page fault or accesses an I/O device, control first traps into the VMM;
The VMM analyzes the cause of the trap and creates a request of the corresponding type, maps the address of the data to be sent into domain0, adds the request to the request ring through the write pointer, and finally sets the event channel;
The communication thread in the qemu-dm module is woken by the VMM through the event channel and begins polling the request ring; if there is data to send, it takes the request out through the read pointer, builds a data packet according to the request, and adds an application-layer header to the packet;
The data is passed to the TCP/IP protocol stack in domain0 and sent out through the network card;
After the destination node receives the data, the network card interrupt hands it to the TCP/IP protocol stack; the protocol stack strips the headers of the transport layer and below and stores the data, the address of the data is saved in a ring descriptor, the descriptor is placed in the service ring through the service ring write pointer, and the VMM is then notified through the event channel to process it;
The VMM reads the service ring, obtains the address from the descriptor, and maps the data into the guest operating system, which completes data reception.
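The send and receive paths of this step can be illustrated with a short sketch. The application-layer header layout (request type, guest address, payload length), the request type constants, and the use of a `deque` in place of the shared-memory rings are all assumptions for illustration; the patent does not define the header format.

```python
import socket
import struct
from collections import deque

# Assumed application-layer header: request type, guest address, payload length.
HEADER = struct.Struct("!BQI")
REQ_PAGE_FAULT, REQ_IO_ACCESS = 1, 2  # hypothetical request type codes


def build_packet(req_type, guest_addr, payload):
    """Prefix the payload with the application-layer header."""
    return HEADER.pack(req_type, guest_addr, len(payload)) + payload


def recv_exact(sock, count):
    """Read exactly count bytes, since TCP may deliver data in pieces."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def send_pending(request_ring, sock):
    """Sender side: drain the request ring and send one packet per request."""
    sent = 0
    while request_ring:
        req_type, guest_addr, payload = request_ring.popleft()
        sock.sendall(build_packet(req_type, guest_addr, payload))
        sent += 1
    return sent


def receive_one(sock, service_ring):
    """Receiver side: read one packet and queue it on the service ring."""
    req_type, guest_addr, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    payload = recv_exact(sock, length)
    service_ring.append((req_type, guest_addr, payload))
```

A length field in the header is one common way to recover packet boundaries on a byte stream; the event channel notifications on both sides are omitted here.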
Step 4, connection closing phase:
The guest operating system is shut down;
The communication module releases the resources of the start node and the remote resources;
The rings are checked to confirm that all data has been processed;
The connections are closed;
The rings and the communication-related resources are released;
The communication thread service exits.
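The ordering of the closing phase can be sketched as below. The function name `close_communication`, the drain timeout, and the use of `deque` objects and a `threading.Event` to stand in for the rings and the communication thread's stop signal are assumptions; the patent specifies only the order of the operations.

```python
import socket
import threading
import time
from collections import deque


def close_communication(rings, connections, stop_event, worker,
                        drain_timeout=1.0):
    """Tear down communication; return True if the rings were fully drained."""
    # Check whether the data in the rings has been processed, waiting briefly
    # for the communication thread to finish what is still queued.
    deadline = time.monotonic() + drain_timeout
    while any(rings.values()) and time.monotonic() < deadline:
        time.sleep(0.01)
    drained = not any(rings.values())

    # Close the connections.
    for conn in connections:
        try:
            conn.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # the peer may already have closed its end
        conn.close()
    connections.clear()

    # Release the rings and related resources.
    for ring in rings.values():
        ring.clear()

    # Let the communication thread exit.
    stop_event.set()
    worker.join(timeout=1.0)
    return drained
```

Checking the rings before closing the connections avoids discarding requests that were queued but not yet sent.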

Claims (2)

1. A method for implementing distributed virtual machine monitor communication, characterized in that the method comprises the following steps:
Step 1, preparation phase:
When the VMM on each node creates a virtual machine and allocates memory for it, the VMM reserves space for the double ring, marks those pages as unavailable to the virtual machine, and initializes the request ring and service ring structures;
When the VMM on each node starts the virtual machine, it also starts the device emulation module dm and maps the pages holding the double ring into the domain0 address space, for use by domain0 and by the dm module, which runs as an application on domain0;
The VMM on each node initializes its event channel; the event channel is managed by the VMM and exposed to domain0 for use;
Step 2, connection establishment phase:
When the guest operating system starts:
The guest operating system is started by the management module; control first traps into the VMM, and after allocating core resources to the guest operating system, the VMM starts the qemu-dm module that provides device emulation for the guest operating system;
The communication module implemented within the qemu-dm module is then started:
To improve the data transmission rate, three connections are used for communication: one for communication between vcpus on different physical nodes, one for remote I/O device access, and one for remote memory access;
The qemu-dm module runs in the application layer of domain0, and each node reads parameters, such as node IP addresses, from the configuration file of the guest operating system and passes them to the module; the start node creates a socket and begins listening for connections, while each non-start node creates a socket and then initiates a connection request; this process is blocking, that is, execution continues only after the connections have been established;
Through a system call, execution enters the kernel mode of domain0, where the TCP/IP protocol stack completes connection establishment; the descriptor of each established connection is saved in a connection array for use when sending data;
Step 3, data transmission process:
When the guest operating system takes a page fault or accesses an I/O device, control first traps into the VMM;
The VMM analyzes the cause of the trap and creates a request of the corresponding type, maps the address of the data to be sent into domain0, adds the request to the request ring through the write pointer, and finally sets the event channel;
The communication thread in the qemu-dm module is woken by the VMM through the event channel and begins polling the request ring; if there is data to send, it takes the request out through the read pointer, builds a data packet according to the request, and adds an application-layer header to the packet;
The data is passed to the TCP/IP protocol stack in domain0 and sent out through the network card;
After the destination node receives the data, the network card interrupt hands it to the TCP/IP protocol stack; the protocol stack strips the headers of the transport layer and below and stores the data, the address of the data is saved in a ring descriptor, the descriptor is placed in the service ring through the service ring write pointer, and the VMM is then notified through the event channel to process it;
The VMM reads the service ring, obtains the address from the descriptor, and maps the data into the guest operating system, which completes data reception;
Step 4, connection closing phase:
The guest operating system is shut down;
The communication module releases the resources of the start node and the remote resources;
The rings are checked to confirm that all data has been processed;
The connections are closed;
The rings and the communication-related resources are released;
The communication thread service exits.
CN2008102398997A 2008-12-23 2008-12-23 Intraconnection communication method of distributed virtual machine monitoring apparatus Expired - Fee Related CN101430674B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008102398997A CN101430674B (en) 2008-12-23 2008-12-23 Intraconnection communication method of distributed virtual machine monitoring apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008102398997A CN101430674B (en) 2008-12-23 2008-12-23 Intraconnection communication method of distributed virtual machine monitoring apparatus

Publications (2)

Publication Number Publication Date
CN101430674A (en) 2009-05-13
CN101430674B CN101430674B (en) 2010-10-20

Family

ID=40646080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008102398997A Expired - Fee Related CN101430674B (en) 2008-12-23 2008-12-23 Intraconnection communication method of distributed virtual machine monitoring apparatus

Country Status (1)

Country Link
CN (1) CN101430674B (en)

Also Published As

Publication number Publication date
CN101430674B (en) 2010-10-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: HUAWEI TECHNOLOGY CO LTD

Free format text: FORMER OWNER: BEIJING AERONAUTICS AND ASTRONAUTICS UNIV.

Effective date: 20110926

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100191 HAIDIAN, BEIJING TO: 518129 SHENZHEN, GUANGDONG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20110926

Address after: 518129 headquarter office building of Bantian HUAWEI base, Longgang District, Shenzhen, Guangdong, China

Patentee after: Huawei Technologies Co., Ltd.

Address before: 100191 School of computer science and engineering, Beihang University, Xueyuan Road 37, Beijing, Haidian District

Patentee before: Beihang University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20101020

Termination date: 20181223
