Background technology
In a cloud computing environment, a large number of computing nodes are needed to complete the tasks submitted by users, and these nodes require space to temporarily store computational data. Because parallel computation is employed, the back-end storage interconnect must be high-speed, stable, reliable, and of ultra-low latency. The bandwidth of today's mainstream gigabit Ethernet cannot meet the demands of future cloud computing environments, where mainstream transmission bandwidth starts at 10 Gbps. Current InfiniBand (IB) technology is very well suited to the cloud computing environment: its single-port bandwidth currently reaches 56 Gbps, its end-to-end latency can reach 200 nanoseconds, and in both bandwidth and latency it is far ahead of Ethernet technology. Its remote direct memory access (RDMA) technology lets the CPU spend only about 3% of system resources processing communication packets, leaving most of the valuable resources to user jobs, with 97% of communication traffic carried out as direct access from local memory to remote memory. TCP/IP consumes CPU resources heavily under high-volume communication; although 10 Gb Ethernet now offers hardware-accelerated TCP/IP Offload Engine (TOE) designs, their limited offload capability and considerable price remain significant obstacles.
Compared with Ethernet's TCP/IP technology, InfiniBand offers higher transmission efficiency. The reason is that while many network protocols can recover from packet loss, they must continually acknowledge and retransmit, so communication based on these protocols slows down, which greatly affects performance. TCP in particular is extremely complex, with a huge code base full of special cases, and is difficult to offload (offloading meaning that it does not consume CPU time). By comparison, InfiniBand uses a trust-based, flow-controlled mechanism to guarantee connection integrity, and packets are rarely lost. With InfiniBand, no data is transmitted unless the receiver has confirmed that it has sufficient buffer space; after a data transfer completes, the receiver returns credits to indicate the availability of buffer space. In this way InfiniBand eliminates the retransmission delays caused by packet loss, improving efficiency and overall performance. In terms of cost-effectiveness, a 40 Gbps InfiniBand card is comparable in price to, or cheaper than, a 10 Gbps Ethernet card, yet its capability far exceeds that of the 10 Gb Ethernet card, giving it a remarkable price-performance ratio.
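The credit-based flow control described above can be illustrated with a minimal sketch. The class and field names here are illustrative; real InfiniBand flow control is implemented per virtual lane in hardware, not in software like this.

```python
# Minimal sketch of credit-based flow control: the sender transmits
# only while it holds credits, and the receiver returns a credit each
# time the application drains a buffer. Names are illustrative.

class Receiver:
    def __init__(self, buffer_slots):
        self.credits = buffer_slots   # free receive-buffer slots
        self.received = []

    def deliver(self, packet):
        assert self.credits > 0, "sender must never exceed granted credits"
        self.credits -= 1
        self.received.append(packet)

    def consume(self):
        # Application drains one buffer; one credit goes back to the sender.
        self.received.pop(0)
        self.credits += 1
        return 1

class Sender:
    def __init__(self, receiver, initial_credits):
        self.receiver = receiver
        self.credits = initial_credits

    def send(self, packet):
        if self.credits == 0:
            return False  # no advertised buffer space: do not transmit
        self.credits -= 1
        self.receiver.deliver(packet)
        return True
```

Because a packet is sent only when buffer space is guaranteed, no packet is ever dropped for lack of buffers, which is exactly why the retransmission delay of lossy protocols disappears.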
The storage protocol based on InfiniBand technology is the SCSI RDMA Protocol (SRP). This protocol encapsulates SCSI commands over InfiniBand, allowing SCSI commands to travel between different systems via RDMA (remote direct memory access), thereby realizing storage device sharing and RDMA communication services. In addition, through IPoIB technology, InfiniBand can carry the TCP/IP protocol over the InfiniBand network, so that many TCP/IP-based applications can enjoy the high-bandwidth, low-latency service of the InfiniBand network without any code changes.
At the same time, while InfiniBand provides ultra-high bandwidth and low latency, its reach is severely limited: ordinary optical fiber runs at most about 300 meters, and ordinary copper cable only 3 to 5 meters. Although long-distance InfiniBand transmission solutions exist at present, the equipment is very expensive and beyond the means of ordinary enterprise users; even for 100-meter fiber links, mediocre stability and high price are obstacles to adoption.
At present, InfiniBand technology is deployed mainly in high-performance computing and data centers, where its outstanding capability has won wide recognition; but its extensibility, that is, how to move beyond the data center and reach enterprise applications, has so far remained an unsolved problem.
Summary of the invention
In view of the above problems, the present invention ingeniously combines InfiniBand storage technology with virtualization technology, so that Ethernet users can access InfiniBand storage devices at high speed through different network protocols (such as iSCSI, NFS, Samba, FTP, and SFTP).
The present invention proposes an InfiniBand protocol conversion method based on bridging, for completing communication between an InfiniBand network and Ethernet, comprising the following steps: a storage server controls actual physical disk groups through the SCSI protocol; after initialization, an SRP target obtains and registers the SCSI storage device information, and the storage server connects to a cloud bridge middleware through an InfiniBand network card and InfiniBand driver, thereby carrying the encapsulated SCSI storage device information to the cloud bridge middleware; the cloud bridge middleware communicates with the storage server through its own InfiniBand network card and driver, and after the SRP target on the storage server finishes initializing, the SRP initiator on the cloud bridge middleware obtains the SCSI storage device information on the storage server, which, after processing by the virtualization module on the cloud bridge middleware, is mapped to new SCSI storage information and registered on an iSCSI target; at the same time the cloud bridge middleware carries the encapsulated SCSI storage information to the client over the TCP/IP protocol and an Ethernet network card; and the iSCSI initiator of the client obtains the SCSI storage device information on the cloud bridge middleware over Ethernet and maps it to the upper-layer file system and applications.
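The three-stage flow of the method above (storage server, cloud bridge middleware, client) can be modeled as a short sketch. All class names, key names, and the discovery mechanism here are hypothetical simplifications, not the actual SRP or iSCSI machinery:

```python
# Illustrative model of the three-stage mapping: SRP target on the
# storage server -> SRP initiator + virtualization + iSCSI target on
# the cloud bridge middleware -> iSCSI initiator on the client.

class StorageServer:
    def __init__(self, disks):
        # SRP target registers the local SCSI storage devices.
        self.srp_target = {f"srp-lun-{i}": d for i, d in enumerate(disks)}

class CloudBridge:
    def __init__(self, server):
        # SRP initiator discovers the server's devices over "InfiniBand";
        # the virtualization module re-maps them and registers them on
        # the local iSCSI target under new names.
        discovered = sorted(server.srp_target.items())
        self.iscsi_target = {f"iscsi-lun-{i}": dev
                             for i, (_, dev) in enumerate(discovered)}

class Client:
    def __init__(self, bridge):
        # iSCSI initiator obtains the devices over "Ethernet" and hands
        # them to the upper-layer file system.
        self.disks = list(bridge.iscsi_target.values())
```

The point of the sketch is the indirection: the client only ever sees the iSCSI names minted by the middleware, never the SRP devices behind them.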
The invention also proposes an InfiniBand protocol conversion system based on bridging. The system comprises a client, a cloud bridge middleware, and a storage server; the storage server connects to the cloud bridge middleware through an InfiniBand network, and the client connects to the cloud bridge middleware through Ethernet. The system is characterized in that: the storage server maps logical unit numbers (LUNs) outward through the SRP protocol; the disk controller module of the cloud bridge middleware packages these LUNs into physical disks recognized by the system, and the cloud bridge middleware manages the physical disks through physical disk management groups; under the control of the cloud bridge logical volume controller in the cloud bridge middleware, the physical disk management groups are integrated and partitioned for users to generate virtual disks; and the virtual disks are mapped to the client for use through the iSCSI protocol.
Further, the storage server and the cloud bridge middleware run a Linux operating system, and the client runs a Windows operating system.
The cloud bridge middleware proposed by the invention has the following advantages:
(1) Architectural advantage: its back end is high-speed storage on an InfiniBand fabric, and after the cloud bridge middleware consolidates these resources, it provides front-end clients with high-speed Ethernet (gigabit or 10-gigabit) or InfiniBand network access;
(2) Technical advantage: it realizes InfiniBand-to-Ethernet protocol conversion, and through storage virtualization technology it realizes back-end storage resource consolidation, allocation on demand, online capacity expansion, multi-level caching, and RDMA;
(3) Safety guarantee: the system can adopt an active/standby dual-plane architecture, realizes 1+1 data backup across the storage server and the cloud bridge middleware, and provides HA high availability for the cloud bridge middleware and the storage server;
(4) The cloud bridge middleware of the present invention is universally compatible and plug-and-play; on the client side it supports the mainstream Windows and Linux platforms.
The method and device of the present invention can be widely applied in fields such as high-performance computing, video surveillance, intelligent transportation, and digital media, providing storage, processing, and related services for big data application scenarios.
Embodiment
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
Fig. 1 shows the conversion diagram of the protocol conversion system, in which the cloud bridge middleware is placed between the storage server and the client: the storage server connects to the cloud bridge middleware through an InfiniBand network, and the cloud bridge middleware connects to the client through Ethernet. The storage server and the cloud bridge middleware may be based on a Linux operating system, and the client may be based on a Windows operating system.
At the storage server end, the Linux system controls the actual physical disk groups through the SCSI protocol. The SRP target (SRPT) module initializes, obtains the SCSI storage device information in the system, and registers it, so that the module can process SCSI commands. The key to SRP target initialization is establishing the link between the SRP target and the SCSI target devices, enabling it to process SRP messages, convert SCSI commands, perform IB management, and provide services such as the RDMA channel. InfiniBand, as the underlying transport protocol, carries the encapsulated SCSI storage device information to the cloud bridge middleware.
At the cloud bridge middleware, the InfiniBand network card (InfiniBand HCA) and InfiniBand driver serve as the bottom-layer InfiniBand communication module for communicating with the storage server. After the SRP target on the storage server finishes initializing, the SRP initiator on the cloud bridge middleware obtains the SCSI storage device information on the storage server. After processing by the virtualization module, this information is mapped to new SCSI storage information and registered on the iSCSI target module. The TCP/IP protocol and the Ethernet network card (Ethernet NIC), as the underlying Ethernet transport, carry the encapsulated SCSI information to the client.
At the client, the iSCSI initiator obtains the SCSI storage device information of the cloud bridge middleware over Ethernet and maps it to the upper-layer file system and applications.
Fig. 2 shows the logical architecture of the system comprising the client, the cloud bridge middleware, and the storage server. The whole system architecture can be divided into four levels:
The first level is completed mainly by the storage server. The back-end storage server cluster maps logical unit numbers (LUNs) outward through the SRP protocol.
The second level is completed mainly by the cloud bridge middleware. The disk controller module of the cloud bridge middleware packages these LUNs into physical disk (PD) objects recognized by the system; each LUN is regarded by the system as a PD. Multiple PDs can be placed in a physical disk management group for management. Different LUNs spanning storage servers can be managed by a physical disk management group in mirror mode, realizing the ability to back up across storage servers. Multiple PDs can also be placed in a physical disk management group in stripe mode, i.e., a RAID0 or RAID5 mechanism across PDs, which greatly increases the parallelism of the system.
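The two grouping modes of this level can be sketched in a few lines. The layout formula below is the standard RAID0 striping scheme; the function names and the dict-based PD model are illustrative, not the middleware's actual implementation:

```python
# Mirror mode vs. stripe mode for a physical disk management group.
# A PD is modeled as a dict mapping block number -> data.

def mirror_write(pds, block_no, data):
    # Mirror mode: every PD in the group receives the same block,
    # so the group survives the loss of any single storage server.
    for pd in pds:
        pd[block_no] = data

def stripe_location(num_pds, block_no):
    # Stripe (RAID0-like) mode: block k lives on PD (k mod n) at
    # offset k // n, so consecutive blocks land on different PDs
    # and can be accessed in parallel.
    return block_no % num_pds, block_no // num_pds
```

Mirroring trades capacity for redundancy; striping trades redundancy for parallel throughput, which is the "greatly increases parallelism" claim above.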
The third level is completed mainly by the cloud bridge middleware. Under the control of the cloud bridge logical volume controller, multiple physical disk management groups are integrated and partitioned into multiple LUNs for users. The generated LUNs are called virtual disks, and a virtual disk can be regarded as a LUN of the cloud bridge middleware itself.
The fourth level is completed jointly by the cloud bridge middleware and the client: the cloud bridge middleware maps the LUNs it has generated to the front-end client for use through the iSCSI protocol.
To avoid a single point of failure, every two cloud bridge middlewares can form an HA high-availability pair; the HA system can operate in active-backup mode or active-active mode.
To simplify the description further, Fig. 3 shows a simplified model of the cloud bridge middleware. The iSCSI protocol defines the rules and methods for sending and receiving block data over a TCP/IP network. iSCSI builds on the existing serial SCSI standard: the iSCSI initiator end (the client) encapsulates SCSI commands and data in TCP/IP packets and sends them over the network; after the iSCSI target end (the cloud bridge middleware) receives the TCP/IP packets, it restores the SCSI commands; the virtualization system of the cloud bridge middleware maps the SCSI requests to the back-end SRP target (the storage server) for processing; and the returned data is remapped through the virtualization system of the cloud bridge middleware, encapsulated in TCP/IP packets, and returned to the iSCSI client.
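The round trip in Fig. 3 can be sketched as a small function chain. Encapsulation is modeled here as tuple nesting; real iSCSI and SRP are binary wire protocols, and every name below is illustrative:

```python
# Command path of the simplified model: client encapsulates, the
# bridge unwraps, remaps via the virtualization table, forwards over
# "InfiniBand", and re-encapsulates the reply for the client.

def iscsi_encapsulate(scsi_cmd):
    # Client side: SCSI command inside an iSCSI PDU inside TCP/IP.
    return ("tcpip", ("iscsi", scsi_cmd))

def bridge_handle(tcp_packet, virt_map, srp_backend):
    proto, (_, scsi_cmd) = tcp_packet
    assert proto == "tcpip"
    op, logical_lba = scsi_cmd
    physical_lba = virt_map[logical_lba]             # virtualization remap
    reply = srp_backend(("srp", (op, physical_lba))) # forward to SRP target
    return ("tcpip", ("iscsi", reply))               # re-encapsulate reply
```

A storage back end can then be modeled as any function that unwraps the SRP packet and serves the requested block; the client never learns that its logical address was rewritten in transit.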
The cloud bridge virtualization system is the core component of the cloud bridge middleware. It neatly solves the difficult problem of converting between the iSCSI (Ethernet) and SRP (InfiniBand network) protocols, and provides high-speed, safe, reliable, usable, and flexible network storage management and services.
The cloud bridge virtualization system virtualizes the raw physical disks mapped in from the back end over SRP into a logical volume view, and provides the front end with unified logical volume device access nodes. The initiating client need not care about the physical location of its data on disk, the disk type, or which disk is being accessed; these details are handled by the cloud bridge virtualization system. In addition, the cloud bridge virtualization system manages the disks mapped in over SRP uniformly as local physical disks, and the cloud bridge logical volume controller maps them into one large physical storage pool for allocation to iSCSI initiator requests.
From the functional analysis above, the cloud bridge virtualization model both negotiates communication with the iSCSI initiator that issues IO access requests and negotiates communication with the SRP target of the back-end storage system, performing the corresponding data transmission control in each direction. It therefore implements two communication modules: the cloud bridge virtual front end and the cloud bridge virtual back end.
Fig. 4 shows the structure of the cloud bridge virtual front end, which is the side that processes the access requests of the iSCSI initiator. Owing to the particularity of IP SAN, the I/O request packets sent from the initiator are standard iSCSI packets: they contain IO access control commands for logical SCSI disks and are encapsulated in TCP/IP packets for transmission. The cloud bridge virtual front end therefore unpacks and analyzes the TCP/IP packets sent by the initiator, restores the SCSI IO access request control commands for the logical disks, performs a preliminary analysis of the logical addresses of the SCSI commands, and judges whether the address range the command will access is out of bounds. The basic function of the cloud bridge virtual front end is thus to continuously process the TCP/IP packets containing iSCSI disk access commands sent by each remote client initiator, unpack and analyze them, and hand the restored SCSI disk access commands to its back-end program for processing.
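The bounds check performed by the front end can be sketched as follows; the command layout (operation, starting LBA, length) and the function name are assumed for illustration:

```python
def front_end_check(scsi_cmd, disk_blocks):
    # Reject any command whose address range crosses the logical
    # disk boundary; only in-range commands reach the back end.
    op, start_lba, length = scsi_cmd
    if start_lba < 0 or start_lba + length > disk_blocks:
        return None          # out-of-bounds request is refused
    return scsi_cmd          # handed on to the back-end program
```

Checking against the logical disk size at the front end keeps invalid addresses from ever being translated or sent across the InfiniBand fabric.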
Fig. 5 shows the structure of the cloud bridge virtual back end. The storage server maps disks to the cloud bridge virtual back end through the SRP protocol, where they are recognized as physical SCSI disks. How to determine the physical addresses of the mapped SCSI disks is something the cloud bridge virtualization system must consider. The cloud bridge virtual back end maintains a mapping table from virtual disk logical addresses to physical disk addresses; when a SCSI disk access command containing a logical address arrives at the cloud bridge virtual back end, the back-end program performs a series of lookups in this mapping table and converts the logical disk address into a physical disk address recognizable by the remote storage server.
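One plausible shape for such a mapping table is extent-based translation, sketched below. The extent granularity, the table layout, and the `(server, physical block)` result format are all assumptions for illustration, not details given in the source:

```python
EXTENT = 1024  # blocks per mapping extent; granularity is an assumption

def translate(vmap, logical_lba):
    # vmap: virtual extent index -> (server_id, physical extent index).
    # One table lookup turns a virtual-disk logical address into a
    # (server, physical block) address the storage server recognizes.
    ext, off = divmod(logical_lba, EXTENT)
    server, phys_ext = vmap[ext]
    return server, phys_ext * EXTENT + off
```

Because consecutive virtual extents may map to different servers, a single virtual disk can transparently span the whole back-end storage pool.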
In this way, the remote iSCSI access request command is converted into a real SCSI disk access command containing valid command parameters and a physical disk address. The cloud bridge virtual back end, acting as an SRP initiator, encapsulates the real disk access command in an InfiniBand packet and sends it over the InfiniBand network to the SRP target of the storage server. After the storage subsystem processes the disk I/O access operation, it returns relevant information to the cloud bridge virtual back end, such as data, status information, or error messages. The cloud bridge virtual back end then attaches these returns to the corresponding entries in the queue of disk access commands awaiting results, returns the messages to the cloud bridge virtual front end, and updates the basic status of the operations in the disk access queue accordingly, for example success, error, or cancellation.
In summary, the method and device proposed by the present invention have the following effects:
1) They form a bridge between InfiniBand and Ethernet, solving the problem of InfiniBand's poor extensibility: toward the storage back end the system uses the SRP protocol, while toward the user it can use protocols such as iSCSI, NFS, Samba, FTP, and SFTP;
2) The independently developed multi-level caching technology is fully integrated with InfiniBand's RDMA technology, exploiting the high-bandwidth, low-latency advantages of InfiniBand. For example, in a test environment of 7200 rpm SAS enterprise hard disks, single-user access can deliver IO bandwidth of 3 GB/s in the read scenario and 2.7 GB/s in the write scenario;
3) Storage virtualization is realized: back-end storage resources are organized and allocated on demand to front-end users, and online capacity expansion can be realized; the cloud bridge volume manager realizes storage resource consolidation and allocation, and the hard disk perceived by the user is in fact a virtual hard disk of the bridging cloud bridge middleware system;
4) Data management techniques such as snapshot, clone, mirror, and CDP redundancy can be realized.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.