CN105653494B

CN105653494B - All-in-one machine

Info

Publication number: CN105653494B
Application number: CN201410635078.0A
Authority: CN
Inventors: 吴聿旻
Original assignee: Hangzhou Huawei Digital Technologies Co Ltd
Current assignee: XFusion Digital Technologies Co Ltd
Priority date: 2014-11-12
Filing date: 2014-11-12
Publication date: 2018-06-26
Anticipated expiration: 2034-11-12
Also published as: CN105653494A

Abstract

The invention discloses an all-in-one machine and belongs to the field of information technology. The all-in-one machine includes: multiple compute nodes, multiple cache units, a low-latency (LL) switch matrix unit, and an input/output interface. The compute nodes are connected to the cache units, and the LL switch matrix unit is connected to the cache units and to the input/output interface. The input/output interface transmits input/output data; each cache unit stores cache-coherent data and transmits it to a compute node after preprocessing it; and the LL switch matrix unit converts between input/output data and cache-coherent data. The invention addresses the heavy computation burden on the compute nodes, the heavy storage burden on the memory, and the high data transmission latency, and reduces all three. The invention is used for data management.

Description

All-in-one machine

Technical field

The present invention relates to the field of information technology, and in particular to an all-in-one machine.

Background art

As enterprises' demand for integrated data center management and automated operation and maintenance (O&M) grows more urgent, converged-infrastructure all-in-one machines have emerged. A converged-infrastructure all-in-one machine combines blade servers, distributed storage, and network switches in one system; integrates powerful intelligent network adapters, solid-state drive (SSD) storage cards, and InfiniBand switching modules; and integrates a distributed storage engine, a virtualization platform, and cloud management software, so that resources can be allocated on demand and expanded linearly. A blade server is a set of card-type server units inserted into a rack-mount chassis of standard height. Each server unit is a blade, and each blade is in effect a system mainboard.

In the prior art, the compute node of a blade server computes much faster than memory can be read or written, so the compute node spends a long time waiting for input/output data to arrive or to be written to memory. To resolve this mismatch between the compute node's computing speed and the memory's read/write speed, a cache unit whose exchange speed is much faster than that of the memory is typically set up in the memory. When the compute node needs to call a large amount of input/output data, it can then call the required data from the cache unit.

However, because the cache unit is set up in the memory of the compute node, it not only occupies memory capacity but also affects the computing speed or storage performance of other, non-cached data. As a result, the computation burden on the compute node and the storage burden on the memory are heavy, and the data transmission latency is high.

Summary of the invention

To solve the problem that the computation burden on the compute node and the storage burden on the memory are heavy and the data transmission latency is high, the present invention provides an all-in-one machine. The technical solution is as follows:

In a first aspect, an all-in-one machine is provided. The all-in-one machine includes:

multiple compute nodes, multiple cache units, a low-latency (LL) switch matrix unit, and an input/output interface;

the multiple compute nodes are connected to the multiple cache units, and the LL switch matrix unit is connected to the multiple cache units and to the input/output interface;

wherein the input/output interface is configured to transmit input/output data; each cache unit is configured to store cache-coherent data and to transmit the cache-coherent data to a compute node after preprocessing it; and the LL switch matrix unit is configured to convert between the input/output data and the cache-coherent data.

With reference to the first aspect, in a first implementable manner,

each compute node is connected to its corresponding cache unit through an external Peripheral Component Interconnect Express (PCIe) channel.

With reference to the first aspect, in a second implementable manner,

the all-in-one machine includes a PCIe switch unit;

the PCIe switch unit is connected to the multiple compute nodes and to the multiple cache units, and is configured to convert between the cache-coherent data and the PCIe data supported by the multiple compute nodes.

With reference to the first aspect through the second implementable manner, in a third implementable manner,

the cache unit includes a cache controller, and the cache controller is configured to preprocess the cache-coherent data.

With reference to the third implementable manner, in a fourth implementable manner,

the cache controller is built from Advanced RISC Machine (ARM) processors, multiple ARM processors can exchange data with one another, and each cache controller includes multiple memory modules.

With reference to the fourth implementable manner, in a fifth implementable manner,

the memory module is a dual in-line memory module (DIMM).

With reference to the third implementable manner, in a sixth implementable manner,

the cache controller includes at least one of a field-programmable gate array (FPGA) unit, a reduced instruction set computer (RISC), and an application-specific integrated circuit (ASIC).

With reference to the first aspect, in a seventh implementable manner,

the preprocessing includes at least one of error correction processing, unpacking processing, and packing processing.

The present invention provides an all-in-one machine in which multiple compute nodes are connected to multiple cache units, and an LL switch matrix unit is connected to the multiple cache units and to an input/output interface. Input/output data can therefore be converted into cache-coherent data by the LL switch matrix unit, and the cache-coherent data is transmitted to the compute nodes after being preprocessed by the cache units. Within a cache-coherent architecture, the all-in-one machine can thus receive, through external cache units, the input/output data processed by the LL switch matrix unit, which reduces the computation burden on the compute nodes and the storage burden on the memory and lowers the data transmission latency.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a structure diagram of an all-in-one machine provided in an embodiment of the present invention;

Fig. 2 is a structure diagram of another all-in-one machine provided in an embodiment of the present invention;

Fig. 3 is a structure diagram of yet another all-in-one machine provided in an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

An embodiment of the present invention provides an all-in-one machine. As shown in Fig. 1, the all-in-one machine includes multiple compute nodes (Ser) 01, multiple cache units 02, a low-latency (LL) switch matrix unit 03, and an input/output interface 04.

The multiple compute nodes 01 are connected to the multiple cache units 02, and the LL switch matrix unit 03 is connected to the multiple cache units 02 and to the input/output interface 04.

The input/output interface 04 is configured to transmit input/output data. Each cache unit 02 is configured to store cache-coherent data and to transmit it to a compute node 01 after preprocessing it. The LL switch matrix unit 03 is configured to convert between input/output data and cache-coherent data.

The LL switch matrix unit is a high-frequency, high-bandwidth interconnect structure used in multi-core processor chips, and it has a low-latency characteristic.

In summary, in the all-in-one machine provided in this embodiment of the present invention, the multiple compute nodes are connected to the multiple cache units, and the LL switch matrix unit is connected to the multiple cache units and to the input/output interface. Input/output data can therefore be converted into cache-coherent data by the LL switch matrix unit, and the cache-coherent data is transmitted to the compute nodes after being preprocessed by the cache units. Within a cache-coherent architecture, the all-in-one machine can thus receive, through external cache units, the input/output data processed by the LL switch matrix unit, which reduces the computation burden on the compute nodes and the storage burden on the memory and lowers the data transmission latency.

It should be noted that each compute node 01 can be connected to its corresponding cache unit 02 through an external Peripheral Component Interconnect Express (PCIe) channel. On a PCIe bus, a device based on the PCIe bus is called an endpoint (EP); in this embodiment of the present invention, the EP is the cache unit 02.

The PCIe bus is a general-purpose bus specification. It uses point-to-point connections: each end of a PCIe link can connect to only one device, and the two devices act as data sender and data receiver for each other.

Further, the cache unit 02 can include a cache controller, which is configured to preprocess the cache-coherent data. Optionally, the cache controller is built from ARM (Acorn RISC Machine) processors, multiple ARM processors can exchange data with one another, and each cache controller includes multiple memory modules. As shown in Fig. 1, a memory module can be a dual in-line memory module (DIMM).

The preprocessing that the cache unit 02 performs on the cache-coherent data can include at least one of error correction processing, unpacking processing, and packing processing. For the detailed procedures of error correction, unpacking, and packing, refer to the prior art; they are not described again in this embodiment of the present invention.
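The text leaves the preprocessing steps to the prior art, so the following Python sketch is only an illustration of what packing, unpacking, and an integrity check might look like. The frame layout (a 2-byte length and a 4-byte CRC-32 in front of the payload) and the function names `pack` and `unpack` are assumptions made for this example and do not come from the patent. CRC-32 only detects errors; it stands in here for the error correction step, which a real cache controller would more likely implement with an error-correcting code.

```python
import struct
import zlib

# Hypothetical frame header: payload length (2 bytes) and CRC-32 (4 bytes),
# both big-endian. This layout is an assumption, not taken from the patent.
HEADER = struct.Struct(">HI")


def pack(payload: bytes) -> bytes:
    """Packing step: prepend the header to the payload."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unpack(frame: bytes) -> bytes:
    """Unpacking step: strip the header and verify length and checksum."""
    length, crc = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        # Stand-in for error handling: detection only, no correction.
        raise ValueError("corrupted frame")
    return payload
```

A frame produced by `pack` round-trips through `unpack`, while a frame with a flipped bit in the payload is rejected with `ValueError`.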

Specifically, the direction of the arrows between the compute nodes 01 and the cache units 02 in Fig. 1 indicates the master-slave relationship between them: the compute node 01 is the master unit, and the cache unit 02 is the slave unit of the compute node 01.

Further, 40GE/IB FDR in Fig. 1 is the input/output interface 04, which transmits input/output data. 40GE/IB FDR denotes general-purpose input/output interfaces named by bandwidth: 40GE is Ethernet with a physical bandwidth of 40 Gb/s (gigabits per second), and IB FDR is an InfiniBand interface with a physical bandwidth of 56 Gb/s. The LL switch matrix unit 03 can receive input/output data from the input/output interface 04 through a super network interface card (SNIC). The LL switch matrix unit 03 can also convert the input/output data, that is, non-cache-coherent (Non Cache Consistency, NCC) data, into cache-coherent (Cache Consistency, CC) data, so that both CC data and NCC data can be transmitted to the cache units 02 over the LL bus (labeled LL in the figure). In Fig. 1, the LL switch matrix unit 03 contains two kinds of arrows: arrow 1 indicates the flow of NCC data, and arrow 2 indicates the flow of CC data.
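The data path described above (I/O interface, then LL switch matrix unit, then cache unit) can be pictured with a small Python model. Everything in it is hypothetical: the patent does not specify what the NCC-to-CC conversion involves or how a target cache unit is selected, so the sketch simply tags the payload with an address and picks a cache unit by address modulo the number of units.

```python
from dataclasses import dataclass, field


@dataclass
class CacheUnit:
    """Hypothetical stand-in for cache unit 02: CC data keyed by address."""
    lines: dict = field(default_factory=dict)

    def store(self, address: int, data: bytes) -> None:
        self.lines[address] = data


@dataclass
class LLSwitchMatrix:
    """Hypothetical stand-in for LL switch matrix unit 03."""
    cache_units: list

    def convert(self, io_data: bytes, address: int) -> dict:
        # Assumed conversion: wrap raw I/O (NCC) data with an address so
        # that it can be handled as CC data.
        return {"kind": "CC", "address": address, "data": io_data}

    def deliver(self, io_data: bytes, address: int) -> int:
        cc = self.convert(io_data, address)
        # Assumed routing policy: address modulo the number of cache units.
        target = cc["address"] % len(self.cache_units)
        self.cache_units[target].store(cc["address"], cc["data"])
        return target
```

With two cache units, `deliver(b"payload", 5)` stores the payload in the unit at index 1 under address 5. A real LL switch matrix would perform this in hardware; the model only makes the direction of the data flow explicit.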

It should be noted that, first, the all-in-one machine provided in this embodiment of the present invention can transmit cache-coherent data and input/output data together within the cache-coherent system formed by the cache units and the LL switch matrix unit, and the LL switch matrix unit itself has a low-latency characteristic. The all-in-one machine therefore makes full use of the low latency of the existing cache-coherent system. Second, the cache controller of the cache unit includes an ARM processor with an integrated direct memory access (DMA) controller. The all-in-one machine therefore makes full use of the DMA controller in the ARM processor: under hardware control, data is exchanged automatically in batches between high-speed peripherals and main memory, an input/output mode that minimizes processor intervention. Third, the ARM processor also has low power consumption, so the independent cache units of the all-in-one machine consume little power. Finally, so that cached data remains intact if the system unexpectedly loses power and the on-site data can be restored promptly once power returns, a power-loss protection unit can be added to each independent cache unit. The all-in-one machine therefore transmits data with higher reliability.

In summary, in the all-in-one machine provided in this embodiment of the present invention, the multiple compute nodes are connected to the multiple cache units, and the LL switch matrix unit is connected to the multiple cache units and to the input/output interface. Input/output data can therefore be converted into cache-coherent data by the LL switch matrix unit, and the cache-coherent data is transmitted to the compute nodes after being preprocessed by the cache units. Within a cache-coherent architecture, the all-in-one machine can thus receive, through external cache units, the input/output data processed by the LL switch matrix unit, which reduces the computation burden on the compute nodes, the storage burden on the memory, and the data transmission latency.

An embodiment of the present invention provides another all-in-one machine. As shown in Fig. 2, the all-in-one machine includes multiple compute nodes (Ser) 01, multiple cache units 02, an LL switch matrix unit 03, an input/output interface 04, and a PCIe switch unit 05.

The PCIe switch unit (PCIe switch, PCIe SW) 05 is connected to the multiple compute nodes 01 and to the multiple cache units 02, and is configured to convert between the cache-coherent data and the PCIe data supported by the multiple compute nodes 01. It should be noted that the PCIe switch unit can pool the input/output cache by means of non-transparent (NT) bridges, so that the multiple compute nodes 01 share the cache in the multiple cache units 02 and the compute nodes 01 and the cache units 02 are no longer in a one-to-one relationship. Further, in a multi-host system, each host can access the system through an NT bridge, and through the address translation capability of the NT bridge, every host can access every EP. That is, through the NT bridges, each compute node 01 in this embodiment of the present invention can access all of the cache units 02; in other words, the input/output cache is pooled. The concept of pooling can be illustrated with a server pool. A server pool can be understood as one super server whose resources are distributed across the servers that make up the pool. Unified management and operation of the pool, in particular balancing, coordinating, and scheduling the resources of the multiple servers, makes the fullest use of the existing computing resources. The process of integrating multiple redundant servers into one highly reliable, highly scalable super server can therefore be called server "pooling".

For the detailed implementation of NT bridge technology, refer to the prior art; it is not described here. As shown in Fig. 2, the PCIe switch unit 05 contains two kinds of arrows: arrow 11 indicates the flow of NCC data, and arrow 22 indicates the flow of CC data.
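To make the pooling idea concrete, the following Python sketch models address translation in the spirit of an NT bridge: several cache units are laid out back to back in one contiguous address space, and a pooled address is translated into a cache unit and a local offset. The names `Window`, `build_pool`, and `translate` and the back-to-back layout are assumptions for illustration; the patent refers to the prior art for how NT bridges actually work, and real NT bridges translate addresses in hardware through configured address windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    base: int     # first address of this window in the pooled address space
    size: int     # window length in bytes
    unit_id: int  # index of the cache unit behind this window


def build_pool(unit_sizes):
    """Lay the cache units out back to back in one contiguous address space."""
    windows, base = [], 0
    for unit_id, size in enumerate(unit_sizes):
        windows.append(Window(base, size, unit_id))
        base += size
    return windows


def translate(windows, address):
    """Translate a pooled address into (cache unit index, local offset)."""
    for window in windows:
        if window.base <= address < window.base + window.size:
            return window.unit_id, address - window.base
    raise ValueError(f"address {address:#x} is outside the pool")
```

For a pool built with `build_pool([1024, 2048, 1024])`, address 1024 maps to offset 0 of the second cache unit, so any compute node that can issue pooled addresses can reach any cache unit.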

The LL switch matrix unit 03 is connected to the multiple cache units 02 and to the input/output interface 04. Specifically, 40GE/IB FDR in Fig. 2 is the input/output interface 04, and the LL switch matrix unit 03 can receive input/output data from the input/output interface 04 through the SNIC.

The input/output interface 04 is configured to transmit input/output data. Each cache unit 02 is configured to store cache-coherent data and, after preprocessing it, to transmit it to a compute node 01 through the PCIe switch unit 05. The LL switch matrix unit 03 is configured to convert between input/output data and cache-coherent data.

Each compute node 01 can be connected to the corresponding cache unit 02 through an external PCIe channel and the PCIe switch unit 05.

Further, the cache unit 02 can include a cache controller, which is configured to preprocess the cache-coherent data. The cache controller can include at least one of a field-programmable gate array (FPGA) unit, a reduced instruction set computer (RISC), and an application-specific integrated circuit (ASIC).

Further, each cache controller can also include multiple memory modules. For example, a memory module can be a DIMM.

The preprocessing that the cache unit 02 performs on the cache-coherent data can include at least one of error correction processing, unpacking processing, and packing processing.

It should be noted that, for other details of Fig. 2, refer to the description of Fig. 1 in the first embodiment; they are not repeated here.

It should be noted that, first, the all-in-one machine provided in this embodiment of the present invention performs distributed computing and caching. An application is built as a set of small services, each running independently in its own process and each independently deployable, so the all-in-one machine is well suited to a microservice architecture. Second, the PCIe bus connects the external devices of the processor system while input/output data is transmitted through the cache-coherent system. The all-in-one machine thus combines the PCIe architecture with the cache-coherent system and realizes data transmission between heterogeneous networks. Third, the ARM processors in the cache units are first virtualized as an input/output cache visible to the compute nodes; adding a PCIe switch unit to the PCIe bus then allows multiple ARM processors, through the NT bridges, to be virtualized as an input/output cache in a contiguous address space visible to the compute nodes. The all-in-one machine therefore implements two levels of virtualization. Finally, because the input/output cache is pooled, a compute node of the all-in-one machine can obtain the input/output cache in any of the ARM processors, and the compute nodes and the cache units are no longer in a one-to-one relationship.

In summary, in the all-in-one machine provided in this embodiment of the present invention, the multiple compute nodes are connected to the multiple cache units through the PCIe switch unit, and the LL switch matrix unit is connected to the multiple cache units and to the input/output interface. Input/output data can therefore be converted into cache-coherent data by the LL switch matrix unit, and the cache-coherent data is transmitted to the compute nodes through the PCIe switch unit after being preprocessed by the cache units. The all-in-one machine can thus share the input/output cache: within a cache-coherent architecture, the compute nodes obtain the input/output cache through the cache units and the PCIe switch unit. This reduces the computation burden on the compute nodes, the storage burden on the memory, and the data transmission latency, and it enables load balancing and dynamic resource management.

An embodiment of the present invention provides yet another all-in-one machine. As shown in Fig. 3, the all-in-one machine includes multiple compute nodes 01, multiple cache units 02, an LL switch matrix unit 03, an input/output interface 04, and a PCIe switch unit 05.

As shown in Fig. 2 or Fig. 3, the PCIe switch unit 05 is connected to the multiple compute nodes 01 and to the multiple cache units 02, and is configured to convert between the cache-coherent data and the PCIe data supported by the multiple compute nodes 01.

The LL switch matrix unit 03 is connected to the multiple cache units 02 and to the input/output interface 04. In the LL switch matrix unit 03, Mem.Da denotes memory data, that is, CC data, and IO Da denotes input/output data, that is, NCC data.

The input/output interface 04 is configured to transmit input/output data. Each cache unit 02 is configured to store cache-coherent data and, after preprocessing it, to transmit it to a compute node 01 through the PCIe switch unit. The LL switch matrix unit 03 is configured to convert between input/output data and cache-coherent data.

Further, the cache unit 02 can include a cache controller, which is configured to preprocess the cache-coherent data. The cache controller can include at least one of an FPGA unit, a RISC, and an ASIC.

Each cache controller can also include multiple memory modules. For example, a memory module can be a dual in-line memory module (DIMM).

Optionally, each cache controller can also include a storage unit 06, and each storage unit 06 can be connected to the cache unit 02 through a serial interface (Serial Advanced Technology Attachment, SATA). For example, the storage unit 06 can be a hard disk drive (HDD) or a solid-state drive (SSD). This embodiment of the present invention can also use standby power (SBY) to protect the storage unit 06 against power loss. Specifically, SATA transmits data serially, one bit at a time. The SATA bus uses an embedded clock signal and has good error correction capability: it can check transmitted data and transmitted instructions and automatically correct any error it finds, which improves the reliability of data transmission. The HDD is the most basic computer storage device; for example, the hard disks commonly found in computers (the C drive and D drive) are HDDs. An SSD is a hard disk made from an array of solid-state electronic storage chips and consists of a control unit and a storage unit. An SSD is identical to an ordinary hard disk in interface specification and definition, function, and usage, and matches it in product shape and size.

As shown in Fig. 3, Stor.Da in the PCIe switch unit 05 denotes storage data. Of the arrows between the cache units 02 and the PCIe switch unit 05, arrow 3 indicates the flow of memory data and arrow 4 indicates the flow of storage data.

It should be noted that, first, the all-in-one machine provided in this embodiment of the present invention deploys multiple small-capacity storage units in a distributed manner, realizing distributed storage of data. The all-in-one machine therefore has higher stability and reliability. Second, adding storage units to the cache units separates the flow of memory data from that of storage data, so the latency of the all-in-one machine's external storage is lower.

In summary, in the all-in-one machine provided in this embodiment of the present invention, because a storage unit is added to each cache unit, the cache unit can split the cache-coherent data by data type, so that the compute nodes can obtain memory data and storage data separately. This reduces the latency of the all-in-one machine's external storage and improves the throughput of the all-in-one machine.
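The splitting of data by type can be sketched in Python as follows. The class `CacheUnitWithStorage`, the `DataType` enum, and the use of dictionaries in place of DIMMs and of storage unit 06 are all hypothetical. The patent states only that memory data (Mem.Da) and storage data (Stor.Da) are routed separately, not how the routing is implemented.

```python
from enum import Enum


class DataType(Enum):
    MEMORY = "Mem.Da"    # memory data, held in the memory modules (DIMMs)
    STORAGE = "Stor.Da"  # storage data, held in storage unit 06 (HDD or SSD)


class CacheUnitWithStorage:
    """Hypothetical model of a cache unit 02 with an attached storage unit 06."""

    def __init__(self):
        self.memory_modules = {}  # stand-in for DIMM-backed memory data
        self.storage_unit = {}    # stand-in for the SATA-attached storage unit

    def _target(self, data_type: DataType) -> dict:
        # Route by data type so the two flows never share a backing store.
        if data_type is DataType.MEMORY:
            return self.memory_modules
        if data_type is DataType.STORAGE:
            return self.storage_unit
        raise ValueError(f"unknown data type: {data_type!r}")

    def write(self, key, value, data_type: DataType) -> None:
        self._target(data_type)[key] = value

    def read(self, key, data_type: DataType):
        return self._target(data_type)[key]
```

Because the two data types use separate backing stores, the same key can hold different values as memory data and as storage data, which mirrors the separate arrows 3 and 4 in Fig. 3.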

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims

1. An all-in-one machine, characterized in that the all-in-one machine includes:

the multiple compute nodes are connected to the multiple cache units, and the low-latency (LL) switch matrix unit is connected to the multiple cache units and to the input/output interface;

wherein the input/output interface is configured to transmit input/output data; each cache unit is configured to store cache-coherent data and to transmit the cache-coherent data to a compute node after preprocessing it; and the LL switch matrix unit is configured to convert between the input/output data and the cache-coherent data.

2. The all-in-one machine according to claim 1, characterized in that

each compute node is connected to its corresponding cache unit through an external Peripheral Component Interconnect Express (PCIe) channel.

3. The all-in-one machine according to claim 1, characterized in that the all-in-one machine includes a PCIe switch unit;

the PCIe switch unit is connected to the multiple compute nodes and to the multiple cache units, and is configured to convert between the cache-coherent data and the PCIe data supported by the multiple compute nodes.

4. The all-in-one machine according to any one of claims 1 to 3, characterized in that

the cache unit includes a cache controller, and the cache controller is configured to preprocess the cache-coherent data.

5. The all-in-one machine according to claim 4, characterized in that

the cache controller is built from Advanced RISC Machine (ARM) processors, multiple ARM processors can exchange data with one another, and each cache controller includes multiple memory modules.

6. The all-in-one machine according to claim 5, characterized in that

the memory module is a dual in-line memory module (DIMM).

7. The all-in-one machine according to claim 4, characterized in that the cache controller includes:

at least one of a field-programmable gate array (FPGA) unit, a reduced instruction set computer (RISC), and an application-specific integrated circuit (ASIC).

8. The all-in-one machine according to claim 1, characterized in that