CN106844263A

CN106844263A - Configurable multiprocessor computer system and implementation method

Info

Publication number: CN106844263A
Application number: CN201611215355.8A
Authority: CN
Inventors: 安学军; 孙凝晖; 王展; 吴冬冬; 安仲奇
Original assignee: Chinese Academy Of Sciences State Owned Assets Management Co ltd; Institute of Computing Technology of CAS
Current assignee: Chinese Academy Of Sciences State Owned Assets Management Co ltd; Institute of Computing Technology of CAS
Priority date: 2016-12-26
Filing date: 2016-12-26
Publication date: 2017-06-13
Anticipated expiration: 2036-12-26
Also published as: CN106844263B

Abstract

The present invention proposes a configurable multiprocessor computer system and an implementation method, and relates to the technical field of computer architecture. The system comprises general-purpose computing units, a high-performance network communication interface, a PCIe-based fusion interconnection controller, and I/O units. The general-purpose computing units access the PCIe-based fusion interconnection controller through the high-performance network interface, the I/O units access the PCIe-based fusion interconnection controller through standard PCIe interfaces, and the I/O units are shared by multiple general-purpose computing units through the PCIe-based fusion interconnection controller. Within this efficiently interconnected, configurable multiprocessor computer system architecture, the quantity and operating mode of general-purpose computing units, acceleration computing units, network devices, high-speed storage, and the like can be configured according to application needs, so that an optimized system can be built that achieves the best performance-per-watt and price-performance ratios.

Description

A configurable multiprocessor computer system and implementation method

Technical field

The present invention relates to the technical field of computer architecture, and more particularly to a configurable multiprocessor computer system and an implementation method.

Background art

High-performance computing has become an important supporting tool in fields such as basic scientific research, national economic development, and defense science and technology. Its spread across industries is changing traditional research and development models, which in turn drives the development of mass-market high-performance computers. The development of high-performance computer systems can therefore be divided into two directions: one is scalable computer systems aimed at exascale (E-class) applications; the other is efficient special-purpose computer systems customized to application requirements.

At present, the computing capability of general-purpose processors keeps improving; the floating-point capability of a single processor is already close to 1 TFlops, which suits most scientific computing and data processing scenarios. For some applications, however, efficiency is low, even below 5% of peak performance, which wastes a great deal of energy. Various computing acceleration components aimed at specific application domains have therefore appeared in recent years, such as GPGPU, Xeon Phi, and FPGA. Most of these components optimize their architecture for the characteristics of the application, so they reach a higher peak computing capability while also improving computational efficiency for specific applications, and thus achieve very high performance-per-watt and price-performance ratios. In terms of composition, the current mainstream multiprocessor computer system consists of computing units connected by an interconnection network. A computing unit generally consists of a general-purpose processor, or of a general-purpose processor connected to a computing acceleration component; in the latter case, the acceleration component can only be managed and used by the general-purpose processor to which it is directly connected. The interconnection network is usually InfiniBand, Ethernet, or another high-performance network. Multiprocessor computer systems with this structure have two shortcomings. On the one hand, the acceleration components are tightly coupled to the general-purpose processors, so component utilization is low: if another general-purpose processor in the system needs to use an acceleration component, the data must first be moved to the general-purpose processor connected to that component and then forwarded to the component, which brings extra communication overhead and reduces system performance. On the other hand, whether InfiniBand, Ethernet, or another high-performance network is used, there is additional network protocol overhead: a computing unit needs a certain amount of protocol processing time to handle these network protocols, which increases communication latency and in turn reduces the performance of the whole multiprocessor computer system.

Summary of the invention

In view of the shortcomings of the prior art, the present invention proposes a configurable multiprocessor computer architecture and an implementation method.

The present invention proposes a configurable multiprocessor computer system, comprising:

General-purpose computing units, a high-performance network communication interface, a PCIe-based fusion interconnection controller, and I/O units; wherein the general-purpose computing units access the PCIe-based fusion interconnection controller through the high-performance network interface, the I/O units access the PCIe-based fusion interconnection controller through standard PCIe interfaces, and the I/O units are shared by multiple general-purpose computing units through the PCIe-based fusion interconnection controller.

The PCIe-based fusion interconnection controller is used to release the tight binding between the general-purpose computing units and the I/O units, and consists of PCIe network interfaces and a fusion interconnection switch;

Each PCIe network interface is a configurable interface module containing four functional modules: a high-performance network interface controller, an upstream P2P bridge, downstream P2P bridges, and a multi-root I/O virtualization engine. The PCIe network interface supports two operating modes: Host mode, for connecting a computing unit, and I/O mode, for connecting an I/O unit.

An application running on a general-purpose computing unit calls the network communication interface to send and receive data, and the network communication interfaces are connected to one another through the PCIe-based fusion interconnection controller.

The network communication interface is implemented cooperatively by a software-level communication runtime environment and a hardware-level network interface controller.

The numbers of general-purpose computing units, high-performance network communication interfaces, PCIe-based fusion interconnection controllers, and I/O units are set according to demand.

The present invention also proposes an implementation method for a configurable multiprocessor computer, comprising:

Providing general-purpose computing units, a high-performance network communication interface, a PCIe-based fusion interconnection controller, and I/O units; connecting the general-purpose computing units to the PCIe-based fusion interconnection controller through the high-performance network interface, connecting the I/O units to the PCIe-based fusion interconnection controller through standard PCIe interfaces, and sharing the I/O units among multiple general-purpose computing units through the PCIe-based fusion interconnection controller.

Releasing the tight binding between the general-purpose computing units and the I/O units by means of the PCIe-based fusion interconnection controller, wherein the PCIe-based fusion interconnection controller consists of PCIe network interfaces and a fusion interconnection switch.

As can be seen from the above scheme, the advantages of the present invention are:

In the configurable multiprocessor computer system architecture interconnected by the PCIe-based fusion interconnection controller, the present invention allows the quantity and operating mode of general-purpose computing units, acceleration computing units, network devices, high-speed storage, and the like to be configured according to application needs, so that an optimized system can be built that achieves the best performance-per-watt and price-performance ratios. It has the following features. First, through a decoupled system architecture based on PCIe interconnection, it realizes a high-performance computer containing multiple general-purpose processors and I/O units, and supports adding the computing, storage, or graphics acceleration components required by a user's specific application, giving a flexible configuration mode. Second, it uses minimal protocol layers and high-performance network communication interface technology to provide high-performance user-level communication among multiple computing units. Third, it uses virtualization sharing technology to provide direct I/O virtualization: by letting multiple computing units share I/O components (including accelerators), it balances the dynamic demand of general-purpose computing units for I/O components and reduces the number of I/O components, lowering the overall power consumption and cost of the multiprocessor computer system while improving I/O resource utilization.

Brief description of the drawings

Fig. 1 is a schematic diagram of one embodiment of the configurable multiprocessor computer system of the present invention;

Fig. 2 is an architecture block diagram of one embodiment of the PCIe-based fusion interconnection controller;

Fig. 3 is a schematic diagram of the interconnection communication system between networked computing units;

Fig. 4 is a schematic diagram of multiple root units sharing one I/O device;

Fig. 5 is a schematic diagram of multi-root PCIe switching based on the ID labeling method;

Fig. 6 is a schematic diagram of host interconnection based on the ID labeling method.

Detailed description of embodiments

To make the purpose and technical solution of the present invention clearer, specific embodiments are given below and the invention is described in further detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention. The drawings show only the parts relevant to the invention and are not drawn according to the number, shape, and size of components in an actual implementation. The invention can also be implemented or applied through other specific embodiments, and the details in this specification can be modified or changed in various ways without departing from the spirit of the invention.

The applicant's granted patent CN103117929A, a Chinese invention patent entitled "A communication method and system based on PCIe data exchange", discloses a communication method and system based on PCIe data exchange. The method comprises: starting a PCIe switch and performing PCIe device discovery and configuration for the processors and PCIe endpoints that communicate with the PCIe switch; a processor or PCIe endpoint sends a PCIe read/write request to a port of the PCIe switch according to routing information, and the port assembles the request into a data packet, using a packet format compatible with the standard PCIe link layer protocol and an extended routing scheme compatible with standard PCIe routing, and sends it to the corresponding port; the corresponding port restores the data packet to a PCIe read/write request and sends it to the processor or PCIe endpoint. That patent realizes a PCIe data exchange technology that removes the topology and routing restrictions of the PCIe bus, so that the PCIe bus can support communication between multiple processors while also expanding I/O devices, and scalable interconnection networks of arbitrary topology can be built. The present invention proposes a configurable multiprocessor computer system architecture and implementation method based on that PCIe data communication method.

A traditional multiprocessor computer system is a combination of a group of computing units, each formed by tightly coupling a fixed number of resources such as general-purpose computing components, acceleration computing components, memory, and I/O. To allow the number of each kind of component in a multiprocessor computer system to be configured according to application requirements, and to reduce cost, power consumption, and floor space while improving computing integration density and resource utilization, the present invention proposes a configurable multiprocessor computer system and implementation method. The system releases the tight binding between general-purpose computing and acceleration computing, as well as the tight binding between general-purpose computing and I/O devices.

Fig. 1 is a schematic diagram of one embodiment of the configurable multiprocessor computer system architecture of the present invention. The architecture consists mainly of the following parts: general-purpose computing units, the high-performance network communication interface HiPNI, the PCIe-based fusion interconnection controller (shown in Fig. 2), and I/O units. The PCIe-based fusion interconnection controller releases the tight binding between general-purpose computing and acceleration computing and between general-purpose computing and I/O devices and, through the PCIe-extended data communication method, provides efficient full interconnection between computing units and I/O units and among computing units. A general-purpose computing unit is a complete computer hardware system, including a processor, independent main memory, a storage disk, and so on, and can run an independent operating system. General-purpose computing units access the PCIe-based fusion interconnection controller through the high-performance network interface, which provides high-performance interconnection communication among multiple general-purpose computing units. An I/O unit can be an acceleration computing component (such as GPGPU, Xeon Phi, or FPGA), network I/O, high-speed storage I/O (SSD), a graphics accelerator card, or the like; it connects to the PCIe-based fusion interconnection controller through a standard PCIe interface, giving a PCIe connection between general-purpose computing units and I/O units. The decoupled I/O units can be shared by multiple general-purpose computing units through the PCIe-based fusion interconnection controller. It should be noted that the present invention places no limit on the number of any component in the system, which can be configured flexibly according to application requirements.
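As an illustration of this flexible configuration, one system instance might be described with a small data structure. This is a minimal sketch and not part of the patent: the names `SystemConfig` and `pcie_network_interfaces` are hypothetical, and the rule of one PCIe network interface per attached unit is an assumption based on the Host and I/O mode descriptions of the PCIe network interface.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SystemConfig:
    """Hypothetical description of one configured system (illustrative only)."""
    computing_units: int                                    # general-purpose computing units
    io_units: Dict[str, int] = field(default_factory=dict)  # I/O unit kind -> count

    def pcie_network_interfaces(self) -> int:
        # Assumption: one PCIe network interface per attached unit,
        # in Host mode for computing units and in I/O mode for I/O units.
        return self.computing_units + sum(self.io_units.values())


# Example: four computing units sharing two GPGPUs and one SSD.
config = SystemConfig(computing_units=4, io_units={"gpgpu": 2, "ssd": 1})
print(config.pcie_network_interfaces())
```

Changing the counts in `io_units` models a different application profile without touching the computing units, which is the kind of independence the decoupled architecture aims for.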

The key to the decoupled system architecture is interconnection communication after the various resource units have been separated. PCIe is currently the most mainstream computer I/O bus protocol interface; it offers high bandwidth and low latency, the protocol is open, and many processors integrate a PCIe interface, such as general-purpose x86 processors, ARM, Xeon Phi many-core processors, and GPGPU. In the traditional computing unit architecture, acceleration computing units (such as Xeon Phi, GPGPU, and FPGA) and I/O units (such as network interface controllers and disk adapters) are all connected directly to the general-purpose processor through PCIe interfaces. Using PCIe as the interconnection protocol between the resource units after decoupling simplifies the communication protocol layers and the system architecture, and provides high-performance interconnection communication while preserving good compatibility and configuration flexibility. The system can flexibly and modularly configure the number and type of the different units according to specific application requirements: for example, configuring computing units with general-purpose processors (such as Intel Xeon) to strengthen general-purpose computing capability, adding heterogeneous acceleration computing components (such as GPGPU or Xeon Phi) to raise computing density, configuring high-performance graphics accelerators to improve professional visualization, or configuring high-performance storage to improve the data processing capability of the system.

However, standard PCIe is designed for interconnection between a single system and its I/O devices. It uses a tree structure in which the IOH connected to the processor is the root and the I/O devices are the leaves, and the processor and the I/O devices have a master-slave relationship. Different root systems (computing units) correspond to different PCIe domains, which are independent of one another and cannot communicate directly. To solve this problem, the present invention proposes networked computing units and, through an interconnection protocol extended from PCIe, provides high-performance communication between computing units.

Fig. 2 is an architecture block diagram of one embodiment of the PCIe-based fusion interconnection controller. The controller consists of PCIe network interfaces (PCIe NI) and a fusion interconnection switch.

Each PCIe network interface is a configurable interface module containing four functional modules: a high-performance network interface controller, an upstream P2P bridge (uP2P), downstream P2P bridges (dP2P), and a multi-root I/O virtualization engine (MRIOV engine). The PCIe network interface can be configured to support two operating modes: Host mode (for connecting a computing unit) and I/O mode (for connecting an I/O unit).

If the PCIe network interface is configured in Host mode, the high-performance network interface controller and the upstream P2P bridge in the interface are enabled. They act as two functions of one PCIe device, share the same physical PCIe link, and are connected to the computing unit. The high-performance network interface controller provides peer-to-peer interconnection communication between the computing unit attached to its PCIe network interface and the other computing units. The upstream P2P bridge, together with the downstream P2P bridges in PCIe ports operating in I/O mode, forms a PCIe switch that provides master-slave interconnection communication between computing units and I/O units.

If the PCIe network interface is configured in I/O mode, downstream P2P bridges equal in number to the computing units and the multi-root I/O virtualization engine are enabled. The multiple downstream P2P bridges share one physical PCIe link and are connected to the I/O unit. The downstream P2P bridges, together with the upstream P2P bridges in PCIe ports operating in Host mode, form a PCIe switch that switches traffic between the I/O unit and the computing units. The multi-root I/O virtualization engine lets multiple computing units share, in a pass-through manner and dynamically on demand, the physical I/O unit attached to its PCIe network interface, and provides isolation and protection for the I/O resources shared among the computing units.
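The two operating modes can be summarized as a mapping from mode to enabled functional modules. The following is a minimal sketch under stated assumptions: `NiMode` and `enabled_modules` are hypothetical names, the module keys are illustrative labels, and the count of one dP2P per computing unit is one reading of the I/O mode description.

```python
from enum import Enum
from typing import Dict


class NiMode(Enum):
    HOST = "host"  # connects a computing unit
    IO = "io"      # connects an I/O unit


def enabled_modules(mode: NiMode, computing_units: int = 1) -> Dict[str, int]:
    """Return how many instances of each functional module are enabled."""
    if mode is NiMode.HOST:
        # Host mode: the network interface controller and the upstream bridge.
        return {"hip_nic": 1, "uP2P": 1, "dP2P": 0, "mriov_engine": 0}
    if mode is NiMode.IO:
        # Assumption: one dP2P per computing unit that may share the I/O unit.
        return {"hip_nic": 0, "uP2P": 0, "dP2P": computing_units, "mriov_engine": 1}
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `enabled_modules(NiMode.IO, computing_units=4)` reports four downstream bridges and one virtualization engine, while Host mode reports only the network interface controller and the upstream bridge.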

To preserve the ordering of communication request and response packets between units and avoid deadlock, while improving throughput without consuming much additional logic resource, the PCIe fusion interconnection switch in the figure can be designed as follows: four parallel crossbar switches are used to switch PCIe peer-to-peer communication transactions and PCIe master-slave communication transactions respectively; each crossbar input port uses two virtual channels to reduce head-of-line blocking; the even-numbered virtual channel buffers only network packets whose output port number is even, and the odd-numbered virtual channel buffers only network packets whose output port number is odd.
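The parity rule for virtual channels can be sketched in a few lines. This is an illustrative software model, not the hardware design: `CrossbarInputPort`, `select_virtual_channel`, and `dequeue_ready` are hypothetical names, and the set of blocked output ports is supplied by the caller purely for demonstration.

```python
from collections import deque
from typing import Deque, List, Optional, Set, Tuple


def select_virtual_channel(out_port: int) -> int:
    """Even output ports use virtual channel 0, odd output ports use channel 1."""
    return out_port % 2


class CrossbarInputPort:
    """One crossbar input port with two virtual channels (illustrative model)."""

    def __init__(self) -> None:
        self.virtual_channels: List[Deque[Tuple[int, bytes]]] = [deque(), deque()]

    def enqueue(self, out_port: int, packet: bytes) -> int:
        vc = select_virtual_channel(out_port)
        self.virtual_channels[vc].append((out_port, packet))
        return vc

    def dequeue_ready(self, blocked_ports: Set[int]) -> Optional[Tuple[int, bytes]]:
        """Pop the head of the first channel whose head packet is not blocked."""
        for channel in self.virtual_channels:
            if channel and channel[0][0] not in blocked_ports:
                return channel.popleft()
        return None
```

With a single queue, a packet waiting for a busy even port would stall everything behind it. In this model a packet for an odd port sits in the other channel and can still be dequeued, which is the head-of-line blocking reduction the text describes.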

As shown in Fig. 2, a computing unit accesses the PCIe-based fusion interconnection controller through a PCIe network interface. An application running on the computing unit calls the high-performance network interface controller in the PCIe network interface to send and receive data, and the high-performance network interface controllers are connected to one another through the PCIe-based fusion interconnection switch, which provides high-performance communication among the multiple hosts in the system. Fig. 3 is a schematic diagram of the interconnection communication system between networked computing units. The network communication interface is implemented cooperatively by the software-level communication runtime environment (CRT: Communication Run-time) and the hardware-level high-performance network interface controller (HiP NIC: High Performance Network Interface Controller). It not only defines how the communication software and the network hardware exchange data, but is also responsible for implementing the communication model of the system. The software-level communication runtime environment, on the one hand, operates and manages the various communication resources (including memory buffers and network interface hardware resources) and implements a low-overhead, highly reliable underlying user-level communication library; on the other hand, it encapsulates different communication models on top of the basic communication protocol and provides convenient programming interfaces for applications. The hardware-level high-performance network interface controller provides architectural support for user-level communication, implements RDMA functions based on the user-level communication protocol, and achieves high-performance data exchange between multiple hosts through the PCIe-based fusion interconnection controller.

To improve computing integration density and I/O resource utilization and to reduce the overall power consumption, cost, and floor space of the system, the present invention proposes decoupling each computing unit from the I/O devices so that all computing subsystems in the system share the I/O resources. Efficient I/O resource sharing reduces the number of redundant I/O devices in the system.

However, current commercial I/O devices can be used by only one root unit (that is, one computing unit). Such a device accepts only the oRID0 (Bus/Device/Function ID) configured by a single master root unit (as shown in Fig. 4), and its BAR addresses can be mapped by MMIO (Memory Mapped I/O) only into the memory space of that master root unit, yielding the memory address oADDR0. When multiple computing units configure and use the same I/O device, such as master root computing unit 0 and user computing unit 1 in Fig. 4, the behavior of the I/O device becomes confused and the system may even crash. To solve this problem, the present invention uses a hardware-assisted multi-root I/O virtualization sharing technology that, without changing the system software and hardware architecture or the I/O device drivers, allows a single I/O device to be dynamically and directly discovered, configured, and shared by multiple root units. For this hardware-assisted multi-root I/O virtualization sharing technology, see the granted patent CN103353861A, "Method and device for realizing a distributed I/O resource pool", although the invention is not limited to the method of that patent.

As described above, the key to enabling the decoupled system architecture is interconnection communication after the various resource units have been separated. To provide high-performance interconnection of multiple computing units, the present invention proposes an interconnection protocol that extends PCIe through an ID labeling method.

Fig. 5 shows a schematic diagram of host interconnection based on the PCIe extension. An RDMA request sent by the high-performance network interface controller is encapsulated into a PCIe TLP packet, and the RDMA request TLP packet is stamped with the destination computing unit identifier (dstCNID). The RDMA request TLP packet carrying the dstCNID can then be switched to the destination computing unit by the fusion interconnection switch, which provides interconnection among multiple computing units.
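The labeling and forwarding step can be modeled in software as follows. This is a sketch under stated assumptions: `RdmaTlp`, `encapsulate`, and `FusionSwitchModel` are hypothetical names, the TLP is reduced to a destination label plus an opaque payload, and no real PCIe TLP header layout is implied.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RdmaTlp:
    dst_cnid: int   # destination computing unit identifier (dstCNID)
    payload: bytes  # RDMA request carried in the TLP (opaque in this model)


def encapsulate(request: bytes, dst_cnid: int) -> RdmaTlp:
    """Wrap an RDMA request and stamp it with the destination CNID."""
    return RdmaTlp(dst_cnid=dst_cnid, payload=request)


class FusionSwitchModel:
    """Forwards labeled TLPs to the computing unit named by dstCNID."""

    def __init__(self) -> None:
        self._inbox: Dict[int, List[RdmaTlp]] = {}

    def attach(self, cnid: int) -> None:
        self._inbox[cnid] = []

    def forward(self, tlp: RdmaTlp) -> None:
        if tlp.dst_cnid not in self._inbox:
            raise KeyError(f"no computing unit with CNID {tlp.dst_cnid}")
        self._inbox[tlp.dst_cnid].append(tlp)

    def delivered(self, cnid: int) -> List[RdmaTlp]:
        return list(self._inbox[cnid])


switch = FusionSwitchModel()
switch.attach(0)
switch.attach(1)
switch.forward(encapsulate(b"rdma-write", dst_cnid=1))
```

After the last line, the request appears only in the inbox of computing unit 1; computing unit 0 receives nothing, because forwarding depends solely on the dstCNID label.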

Fig. 6 shows a schematic diagram of multi-root PCIe switching. The uP2P connected to a computing unit is configured with the corresponding CNID. When an I/O device function (such as F1 in Fig. 6) is assigned to a user computing unit (such as computing unit 1 in Fig. 6), the dP2P connected to that I/O unit function is configured with the corresponding CNID. In this way, one uP2P and multiple dP2Ps with the same CNID form a virtual PCIe switch for the computing unit with that CNID. A PCIe transaction initiated by a computing unit is labeled with the unit's own CNID as it passes through the uP2P; the globally unified RID or MMIO address, composed of the CNID and the RID or MMIO address of the I/O unit to be accessed, allows the transaction to be addressed and switched to the one correct I/O unit function. In the reverse direction, a PCIe transaction initiated by an I/O unit function is labeled by the dP2P with the CNID of the computing unit it belongs to, which provides addressing back to the computing unit. Extending PCIe with CNID labels in this way logically isolates different PCIe domains: units with the same ID label belong to the same PCIe domain. The extended PCIe protocol can interconnect and switch PCIe transactions among multiple computing units and multiple I/O devices without adding any protocol conversion overhead, and at the same time provides isolation and protection for PCIe transaction switching between multiple root units and I/O unit functions.
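The globally unified identifier formed from a CNID and a local RID can be illustrated with a lookup table. This is a minimal sketch, not the patented hardware: `MultiRootAddressMap` and its methods are hypothetical names, and I/O functions are represented by plain strings such as "F1".

```python
from typing import Dict, Optional, Tuple


class MultiRootAddressMap:
    """Maps (CNID, RID) pairs to I/O functions and I/O functions to owners."""

    def __init__(self) -> None:
        self._down: Dict[Tuple[int, int], str] = {}  # (CNID, RID) -> I/O function
        self._owner: Dict[str, int] = {}             # I/O function -> owning CNID

    def assign(self, cnid: int, rid: int, io_function: str) -> None:
        """Assign an I/O function to a computing unit under a local RID."""
        self._down[(cnid, rid)] = io_function
        self._owner[io_function] = cnid

    def route_downstream(self, cnid: int, rid: int) -> Optional[str]:
        # The CNID added at the uP2P disambiguates identical RIDs
        # used in different PCIe domains.
        return self._down.get((cnid, rid))

    def route_upstream(self, io_function: str) -> Optional[int]:
        # The dP2P labels upstream traffic with the owner's CNID.
        return self._owner.get(io_function)


address_map = MultiRootAddressMap()
address_map.assign(cnid=0, rid=0x100, io_function="F0")
address_map.assign(cnid=1, rid=0x100, io_function="F1")
```

Both computing units use the same local RID `0x100`, yet each reaches only its own function, and a CNID with no assignment resolves to nothing. That is the isolation property the CNID label is meant to provide.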

Claims

1. A configurable multiprocessor computer system, characterized by comprising:

General-purpose computing units, a high-performance network communication interface, a PCIe-based fusion interconnection controller, and I/O units; wherein the general-purpose computing units access the PCIe-based fusion interconnection controller through the high-performance network interface, the I/O units access the PCIe-based fusion interconnection controller through standard PCIe interfaces, and the I/O units are shared by multiple general-purpose computing units through the PCIe-based fusion interconnection controller.

2. The configurable multiprocessor computer system as claimed in claim 1, characterized in that the PCIe-based fusion interconnection controller is used to release the tight binding between the general-purpose computing units and the I/O units, wherein the PCIe-based fusion interconnection controller consists of PCIe network interfaces and a fusion interconnection switch;

Each PCIe network interface is a configurable interface module containing four functional modules: a high-performance network interface controller, an upstream P2P bridge, downstream P2P bridges, and a multi-root I/O virtualization engine; the PCIe network interface supports two operating modes: Host mode, for connecting a computing unit, and I/O mode, for connecting an I/O unit.

3. The configurable multiprocessor computer system as claimed in claim 1, characterized in that an application running on a general-purpose computing unit calls the network communication interface to send and receive data, and the network communication interfaces are connected to one another through the PCIe-based fusion interconnection controller.

4. The configurable multiprocessor computer system as claimed in claim 3, characterized in that the network communication interface is implemented cooperatively by a software-level communication runtime environment and a hardware-level network interface controller.

5. The configurable multiprocessor computer system as claimed in claim 1, characterized in that the numbers of general-purpose computing units, high-performance network communication interfaces, PCIe-based fusion interconnection controllers, and I/O units are set according to demand.

6. An implementation method for a configurable multiprocessor computer, characterized by comprising:

Providing general-purpose computing units, a high-performance network communication interface, a PCIe-based fusion interconnection controller, and I/O units; connecting the general-purpose computing units to the PCIe-based fusion interconnection controller through the high-performance network interface, connecting the I/O units to the PCIe-based fusion interconnection controller through standard PCIe interfaces, and sharing the I/O units among multiple general-purpose computing units through the PCIe-based fusion interconnection controller.

7. The implementation method for a configurable multiprocessor computer as claimed in claim 6, characterized in that the tight binding between the general-purpose computing units and the I/O units is released by means of the PCIe-based fusion interconnection controller, wherein the PCIe-based fusion interconnection controller consists of PCIe network interfaces and a fusion interconnection switch.

8. The implementation method for a configurable multiprocessor computer as claimed in claim 6, characterized in that an application running on a general-purpose computing unit calls the network communication interface to send and receive data, and the network communication interfaces are connected to one another through the PCIe-based fusion interconnection controller.

9. The implementation method for a configurable multiprocessor computer as claimed in claim 8, characterized in that the network communication interface is implemented cooperatively by a software-level communication runtime environment and a hardware-level network interface controller.

10. The implementation method for a configurable multiprocessor computer as claimed in claim 6, characterized in that the numbers of general-purpose computing units, high-performance network communication interfaces, PCIe-based fusion interconnection controllers, and I/O units are set according to demand.