Background Art
In a cloud computing environment, a great number of computing nodes are needed to complete the tasks submitted by users, and these computing nodes need space to temporarily store computed data. Because parallel computation is used, the back-end connection to storage must be high-speed, stable, reliable, and of ultra-low latency. The bandwidth of the currently mainstream gigabit network can no longer satisfy the demands of the cloud computing environment; mainstream transmission bandwidth in cloud computing environments now starts at 10 Gbps. Current InfiniBand (IB) technology is very well suited to the cloud computing environment: at present its single-port bandwidth reaches up to 56 Gbps and its end-to-end latency can reach 200 nanoseconds, leading Ethernet technology by far in both bandwidth and latency. Its remote direct memory access (RDMA) technology lets the CPU spend only 3% of total system resources on processing communication packets, leaving most of the valuable resources to user jobs; 97% of communication traffic is realized through local memory accessing remote memory. TCP/IP consumes CPU resources heavily under high-traffic communication; although 10 Gb Ethernet now offers the hardware-accelerated TCP/IP Offload Engine (TOE) design, its limited offload capability and considerable price remain significant obstacles.
Compared with the TCP/IP technology of Ethernet, InfiniBand technology has higher transmission efficiency. The reason is that many network protocols have the ability to retransmit lost packets, but because they must constantly acknowledge and retransmit, communication based on these protocols slows down, greatly affecting performance. The TCP protocol in particular is extremely complex, has a huge code base full of special cases, and is difficult to offload (offloading meaning that it does not consume CPU running time). By comparison, InfiniBand uses a credit-based flow control mechanism to guarantee the integrity of a connection, and packets are seldom lost. With InfiniBand, data are not transmitted unless the receive buffer is confirmed to have enough space; after a data transfer finishes, the receiver returns a credit indicating the availability of buffer space. In this way, InfiniBand eliminates the retransmission delay caused by packet loss, thereby improving efficiency and overall performance. From a cost-performance perspective, a 40 Gbps InfiniBand card is priced comparably to, or cheaper than, a 10 Gbps 10-gigabit Ethernet card, while its capability far exceeds that of the 10-gigabit Ethernet card, giving it remarkable cost performance.
The storage protocol based on InfiniBand technology is the SCSI RDMA Protocol (SRP). This protocol encapsulates SCSI commands over InfiniBand, allowing SCSI commands to be communicated between different systems and realizing storage device sharing and RDMA communication services through RDMA (remote direct memory access). In addition, InfiniBand implements the TCP/IP protocol on InfiniBand networks through the IPoIB technology, so that many TCP/IP-based applications can enjoy the high-bandwidth, low-latency service of the InfiniBand network without changing any code.
At the same time, although InfiniBand provides ultra-high bandwidth and low latency, its extensibility is severely limited: ordinary optical fiber links generally do not exceed 300 meters, and common copper cable transmission reaches only 3-5 meters. Although long-distance InfiniBand transmission solutions exist at present, the equipment is very expensive and beyond the means of ordinary enterprise customers; even for fiber links around 100 meters, poor stability and high price remain obstacles to adoption.
At present, InfiniBand technology is deployed mainly in high-performance computing and data centers, where its outstanding capability has gained wide recognition; but how to extend it beyond the data center and realize enterprise applications has so far remained an unsolved problem.
Summary of the Invention
In view of the above problems, the present invention ingeniously combines InfiniBand storage technology with virtualization technology, so that Ethernet users can access InfiniBand storage devices at high speed through different network protocols (such as iSCSI, NFS, Samba, FTP, and SFTP).
The present invention proposes a bridging-based InfiniBand protocol conversion method, used to complete communication between an InfiniBand network and an Ethernet, comprising the following steps: a storage server controls actual physical disk groups through the SCSI protocol; an SRP target obtains SCSI storage device information after initialization and registers it, and said storage server, connected to a cloud bridge middleware through an InfiniBand network card and an InfiniBand driver, delivers the encapsulated SCSI storage device information to said cloud bridge middleware; said cloud bridge middleware communicates with said storage server through its InfiniBand network card and InfiniBand driver; after initialization of the SRP target on said storage server finishes, the SRP initiator on said cloud bridge middleware acquires the SCSI storage device information on said storage server; after processing by the virtualization module on said cloud bridge middleware, the information is mapped as new SCSI storage information and registered on an iSCSI target, and at the same time said cloud bridge middleware delivers the encapsulated SCSI storage information to a client through the TCP/IP protocol and an Ethernet network card; the iSCSI initiator of said client obtains the SCSI storage device information on the cloud bridge middleware through the Ethernet and maps it to the upper-layer file system and application programs.
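The steps above can be sketched as a minimal data-flow model: device information registered at the SRP target is fetched by the bridge's SRP initiator, remapped by a virtualization step, and re-exposed at an iSCSI target for the client. All class and field names here are illustrative stand-ins, not a real SRP/iSCSI stack.

```python
class SRPTarget:
    """Storage-server side: registers SCSI device info after init."""
    def __init__(self, devices):
        self.registered = list(devices)   # SCSI storage device information

class CloudBridge:
    """Middleware: SRP initiator in, virtualization, iSCSI target out."""
    def __init__(self, srp_target):
        # the SRP initiator acquires the device info over the IB link
        fetched = srp_target.registered
        # the virtualization module remaps it as new SCSI storage info
        self.iscsi_target = [f"vdisk:{dev}" for dev in fetched]

class Client:
    """Ethernet client: iSCSI initiator hands disks to the file system."""
    def __init__(self, bridge):
        self.disks = list(bridge.iscsi_target)

server = SRPTarget(["sda", "sdb"])
client = Client(CloudBridge(server))
print(client.disks)   # ['vdisk:sda', 'vdisk:sdb']
```

The point of the sketch is the one-way registration chain: the client never sees the back-end device names directly, only the remapped virtual-disk identities published by the middleware.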
The present invention also proposes a bridging-based InfiniBand protocol conversion system, said system comprising a client, a cloud bridge middleware, and a storage server, said storage server being connected to said cloud bridge middleware through an InfiniBand network and said client being connected to said cloud bridge middleware through an Ethernet, characterized in that: said storage server maps logical unit numbers externally through the SRP protocol; a disk controller module of said cloud bridge middleware encapsulates said LUNs as physical disks recognized by the system, and said physical disks are managed through a plurality of physical disk management groups of the cloud bridge middleware; under the control of a cloud bridge logical volume controller in the cloud bridge middleware, said physical disk management groups are integrated, and virtual disks are divided and generated for users; said virtual disks are mapped to said client for use through the iSCSI protocol.
Further, said storage server and said cloud bridge middleware adopt the Linux operating system, and said client adopts the Windows operating system.
The cloud bridge middleware proposed by the present invention has the following advantages:
(1) Architecture advantage: its back end is high-speed storage on an InfiniBand architecture, and after resource consolidation by the cloud bridge middleware, front-end clients are provided with access over fast Ethernet (gigabit or 10-gigabit) or an InfiniBand network;
(2) Technical advantage: it realizes InfiniBand-to-Ethernet protocol conversion capability and, through storage virtualization technology, realizes back-end storage resource integration, on-demand allocation, online capacity expansion, multi-level caching, and RDMA;
(3) Safety guarantee: the system can realize an active-standby dual-plane architecture, realizes 1+1 data backup on the storage server and the cloud bridge middleware, and realizes HA high availability between the cloud bridge middleware and the storage server;
(4) The cloud bridge middleware of the present invention has universal compatibility and is plug-and-play; the client supports the mainstream Windows and Linux platforms.
The method and device of the present invention can be widely used in fields such as high-performance scientific computing, video surveillance, intelligent transportation, and digital media, providing storage, processing, and the like for big data application scenarios.
Embodiment
The principles and features of the present invention are described below in conjunction with the accompanying drawings; the examples given are intended only to explain the present invention, not to limit its scope.
Fig. 1 shows a conversion diagram of the protocol conversion system, in which the cloud bridge middleware is placed between the storage server and the client; the storage server connects to the cloud bridge middleware through an InfiniBand network, and the cloud bridge middleware connects to the client through an Ethernet. The storage server and the cloud bridge middleware can be based on the Linux operating system, and the client can be based on the Windows operating system.
At the storage server end, the Linux system controls the actual physical disk groups through the SCSI protocol. The SRP target (SRP Target, SRPT) module is initialized: the SRP target module obtains the SCSI storage device information in the system and registers it, so that the module has the ability to process SCSI commands. The key of SRP target initialization is to establish an association between the SRP target and the SCSI target devices, giving it functions such as processing SRP messages, performing SCSI command conversion, carrying out IB management, and providing RDMA channels. InfiniBand, as the underlying transport protocol, carries the encapsulated SCSI storage device information to the cloud bridge middleware.
At the cloud bridge middleware, the InfiniBand network card (InfiniBand HCA) and the InfiniBand driver (InfiniBand Driver) serve as the InfiniBand underlying network communication module and communicate with the storage server. After the initialization of the SRP target on the storage server finishes, the SRP initiator on the cloud bridge middleware acquires the SCSI storage device information on the storage server. After processing by the virtualization module, the information is mapped as new SCSI storage information and registered on the iSCSI target (iSCSI Target) module. The TCP/IP protocol and the Ethernet network card (Ethernet NIC), as the underlying Ethernet transport protocol, carry the encapsulated SCSI information to the client.
At the client, the iSCSI initiator obtains the SCSI storage device information on the cloud bridge middleware through the Ethernet and maps it to the upper-layer file system and application programs.
Fig. 2 shows a logical architecture diagram of the system comprising the client, the cloud bridge middleware, and the storage server. The overall system architecture can be divided into four levels:
The first level is mainly completed by the storage server. The back-end storage server cluster maps logical unit numbers (Logical Unit Numbers, LUN) externally through the SRP protocol.
The second level is mainly completed by the cloud bridge middleware. The disk controller module of the cloud bridge middleware encapsulates these LUNs as physical disk (Physical Disk, PD) objects recognized by the system; each LUN is regarded by the system as one PD. A plurality of PDs can be placed in a physical disk management group to be managed, and LUNs spanning different storage servers can be managed by the physical disk management group in a mirror group pattern, realizing storage server backup capability. A plurality of PDs can also be placed in a physical disk management group and managed in a stripe group mode, that is, realizing raid0 and raid5 mechanisms across PDs, greatly improving the parallelism of the system.
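The two group modes described above can be illustrated with a toy block layout: a mirror group writes every block to all member PDs (giving cross-server backup), while a stripe group (raid0-style) spreads consecutive logical blocks across PDs. The PD objects and block maps below are invented for illustration only.

```python
def mirror_write(pds, block_no, data):
    """Mirror group: duplicate the block onto every member PD."""
    for pd in pds:
        pd[block_no] = data

def stripe_locate(pd_count, block_no):
    """Stripe group: map a logical block to (pd_index, pd_block)."""
    return block_no % pd_count, block_no // pd_count

# two PDs backed by LUNs on different storage servers
pds = [dict(), dict()]
mirror_write(pds, 7, b"payload")
assert pds[0][7] == pds[1][7] == b"payload"   # both servers hold the block

# raid0-style striping: logical blocks 0..3 alternate across 2 PDs
print([stripe_locate(2, b) for b in range(4)])
# [(0, 0), (1, 0), (0, 1), (1, 1)]
```

Striping lets consecutive IO land on different back-end servers in parallel, which is the parallelism gain the text claims; mirroring trades capacity for the server-failure tolerance.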
The third level is mainly completed by the cloud bridge middleware. Under the control of the cloud bridge logical volume controller, a plurality of physical disk management groups are integrated and divided into a plurality of LUNs for users; the generated LUNs are called virtual disks (virtual disk), and a virtual disk can be regarded as a LUN of the cloud bridge middleware itself.
The fourth level is mainly completed jointly by the cloud bridge middleware and the client; the cloud bridge middleware maps the LUNs it generates to front-end clients for use through the iSCSI protocol.
To avoid a single point of failure, every two cloud bridge middlewares can form an HA high-availability pair; the HA system can realize an active-backup mode and an active-active mode.
For further simplicity of illustration, Fig. 3 shows a simplified model of the cloud bridge middleware. The iSCSI protocol defines the rules and methods for sending and receiving block-level data over a TCP/IP network. iSCSI uses the existing serial SCSI standard: the iSCSI initiator end (client) encapsulates SCSI commands and data in TCP/IP packets and transmits them over the network; after the iSCSI target end (cloud bridge middleware) receives the TCP/IP packets, it restores them to SCSI commands; the virtualization system of the cloud bridge middleware maps the SCSI request to the back-end SRP target (storage server) for processing, and at the same time the returned data are remapped by the virtualization system of the cloud bridge middleware, encapsulated in TCP/IP packets, and passed back to the iSCSI client.
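The round trip of Fig. 3 can be sketched as a toy encapsulation chain: the client wraps a SCSI command, the bridge unwraps it, hands it to a back-end stand-in, and re-wraps the reply. The packet layouts here are invented stand-ins, not real iSCSI or SRP PDU formats.

```python
def iscsi_encap(scsi_cmd):                 # client side (iSCSI initiator)
    return {"tcp": {"iscsi": scsi_cmd}}

def bridge_forward(tcp_pkt, backend):      # cloud bridge middleware
    scsi_cmd = tcp_pkt["tcp"]["iscsi"]     # restore the SCSI command
    result = backend(scsi_cmd)             # map to the back-end SRP target
    return {"tcp": {"iscsi": result}}      # re-encapsulate the reply

def srp_backend(scsi_cmd):                 # storage-server stand-in
    return {"status": "GOOD", "echo": scsi_cmd}

reply = bridge_forward(iscsi_encap({"op": "READ", "lba": 0}), srp_backend)
print(reply["tcp"]["iscsi"]["status"])     # GOOD
```

The bridge never interprets the payload beyond restoring the SCSI command, which is what lets the same middleware serve any SCSI-speaking client.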
The cloud bridge virtualization system is the core component of the cloud bridge middleware; it cleverly solves the difficult problem of converting between the iSCSI (Ethernet) and SRP (InfiniBand network) protocols, and provides a high-speed, safe, reliable, available, and flexible network storage management and service.
The cloud bridge virtualization system virtualizes the raw physical disks mapped by the back-end SRP into logical volume views and offers the front end unified logical volume device access nodes; the initiating client need not care about the physical location of the accessed data on disk, the disk type, or the way data are stored on the disk, as these details are handled by the cloud bridge virtualization system. In addition, the cloud bridge virtualization system is responsible for treating the disks mapped by the back-end SRP as local physical disks under unified management, mapping them through the cloud bridge logical volume controller into one large physical storage pool from which requests of the iSCSI initiator end are allocated.
From the analysis of the above system functional model, the virtual model of the cloud bridge both negotiates with the iSCSI initiator that initiates the IO access request and negotiates with the SRP target of the back-end storage system, and controls the responding data transmission; it therefore realizes two communication templates, namely the cloud bridge virtual front end and the cloud bridge virtual back end.
Fig. 4 shows a structure diagram of the cloud bridge virtual front end. The cloud bridge virtual front end is the end that handles the access requests of the iSCSI initiator. Owing to the particularity of the IP SAN, the IO request data packets sent from the initiator are standard iSCSI packets, which contain IO access control commands for logical SCSI disks and are encapsulated in TCP/IP packets for transmission. The cloud bridge virtual front end therefore unpacks and analyzes the TCP/IP packets sent by the initiator, restores them to SCSI IO access request control commands for the logical disks, performs an initial analysis of the logical addresses of the SCSI commands, and judges whether the address range the command is to access is out of bounds. The basic function of the cloud bridge virtual front end is thus to continuously process TCP/IP packets containing iSCSI disk access commands sent from each remote client initiator; after the corresponding unpacking and analysis, the restored SCSI disk access commands are sent to its back-end program for processing.
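The out-of-bounds judgment in the front end's initial analysis can be reduced to a range check on the command's logical block address and block count; a minimal sketch, with the virtual-disk capacity table invented for illustration.

```python
DISK_BLOCKS = {"vdisk0": 1000}   # hypothetical virtual-disk capacity in blocks

def address_in_bounds(disk, lba, n_blocks):
    """Accept only commands whose range [lba, lba + n_blocks) stays on the disk."""
    size = DISK_BLOCKS[disk]
    return 0 <= lba and lba + n_blocks <= size

assert address_in_bounds("vdisk0", 990, 10)        # ends exactly at the boundary
assert not address_in_bounds("vdisk0", 995, 10)    # crosses past the last block
```

Rejecting such commands at the front end keeps malformed requests from ever reaching the InfiniBand back end.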
Fig. 5 shows a structure diagram of the cloud bridge virtual back end. The storage server maps its disks to the cloud bridge virtual back end through the SRP protocol, where they are recognized as physical SCSI disks. How to determine the physical addresses of the mapped SCSI disks is something the cloud bridge virtualization system must consider. The cloud bridge virtual back end maintains a mapping table from virtual disk logical addresses to physical disk addresses; when a SCSI disk access command containing a logical address arrives at the cloud bridge virtual back end, the virtual back-end program performs a series of lookups in this mapping table and converts the logical disk address into a physical disk address recognizable by the remote storage server.
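The mapping-table lookup can be sketched as an extent search: each entry maps a contiguous run of virtual-disk logical blocks to a (server, physical disk, offset) triple that the remote storage server can recognize. The extent layout below is invented for illustration.

```python
# each entry: (logical_start, length, server, phys_disk, phys_start)
MAPPING_TABLE = [
    (0,   500, "srv1", "sda", 0),
    (500, 500, "srv2", "sdb", 0),
]

def translate(lba):
    """Map a virtual-disk logical block to a physical disk address."""
    for start, length, server, disk, phys in MAPPING_TABLE:
        if start <= lba < start + length:
            return server, disk, phys + (lba - start)
    raise ValueError("logical address not mapped")

print(translate(750))   # ('srv2', 'sdb', 250)
```

Because the table is the only place logical and physical addresses meet, the pool can be regrown or rebalanced by editing extents without the client ever noticing, which is what makes online expansion possible.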
In this way, a remote iSCSI access request command can be converted into a real SCSI disk access command containing valid command parameters and a physical disk address. The cloud bridge virtual back end, acting as the SRP initiator, encapsulates the real disk access command in an InfiniBand packet and sends it through the InfiniBand network to the SRP target of the storage server. After the storage subsystem processes the disk IO access operation, it returns relevant information to the cloud bridge virtual back end, such as data, status information, or error messages. At this point the cloud bridge virtual back end adds these return messages to the corresponding disk access command queue awaiting results, returns the messages to the cloud bridge virtual front end, and correspondingly updates the basic status information in the disk access queue, for example success, error, or operation canceled.
In summary, the method and device proposed by the present invention have the following effects:
1) They form a bridge between InfiniBand and Ethernet, solving the problem of poor InfiniBand extensibility; the SRP protocol is adopted toward the storage back end, while toward the user, protocols such as iSCSI, NFS, Samba, FTP, and SFTP can be adopted;
2) The independently developed multi-level caching technology is fully merged with the RDMA technology of InfiniBand, exploiting the high-bandwidth, low-latency advantages of InfiniBand technology. For example, in a test environment of SAS 7200 rpm enterprise hard disks, single-user access can achieve an IO bandwidth of 3 GB/s in the read scenario and 2.7 GB/s in the write scenario;
3) A storage virtualization capability is realized: back-end storage resources are organized and allocated on demand to front-end users, and an online capacity expansion function can be realized; the cloud bridge volume manager can realize storage resource integration and storage resource allocation, and the hard disk perceived by the user is actually a virtual hard disk of the bridging cloud bridge middleware system;
4) Data management techniques can be realized, including snapshot, clone, mirror, and CDP redundancy techniques.
The above is merely a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.