CN105005361A - High-performance-computation-oriented novel processing unit and computer system architecture - Google Patents

High-performance-computation-oriented novel processing unit and computer system architecture

Info

Publication number
CN105005361A
Authority
CN
China
Prior art keywords
novel process
process unit
minimum logical
unit
processing unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510375045.1A
Other languages
Chinese (zh)
Inventor
王磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201510375045.1A priority Critical patent/CN105005361A/en
Publication of CN105005361A publication Critical patent/CN105005361A/en
Pending legal-status Critical Current

Landscapes

  • Multi Processors (AREA)

Abstract

The invention provides a novel processing unit for high-performance computing and a computer system architecture for high-performance computing. The novel processing unit comprises two or more minimum logical units connected in sequence, internal storage controllers and internal storage expanders. Each of the two sides of every two adjacent minimum logical units is connected to the internal storage controller corresponding to those two units; each minimum logical unit comprises two interconnected computing processing units; the internal storage expanders are connected to the two minimum logical units at the head and the tail; and the internal storage controllers on the same side of the minimum logical units are connected to one another, so that the computing processing units in any two minimum logical units are connected. Modules other than the data computation modules are omitted from the novel processing unit, so that more computing processing units can be integrated in it and its data computation capability is improved; the internal storage controllers are combined with the internal storage expanders, which enlarges the storage space of the novel processing unit.

Description

A novel processing unit for high-performance computing and a computer system architecture
Technical field
The present invention relates to the field of computer technology, and in particular to a novel processing unit for high-performance computing and a computer system architecture.
Background
With the development of technologies such as cloud computing and big data, users place ever higher demands on the processing capability of computer architectures.
A traditional computer system comprises a processing unit, a core data switching unit and a management switching unit. The processing unit may comprise several computing processing units connected by an interconnect, I/O controllers and the like, and performs the core computation of the computer architecture.
However, a traditional processing unit can comprise at most 20 computing processing units, which limits the computer system when it computes high-density, highly concurrent data. A novel processing unit for high-performance computing is therefore urgently needed to carry out high-performance, high-density computation.
Summary of the invention
In view of this, the invention provides a novel processing unit for high-performance computing and a computer system architecture, so as to carry out high-performance, high-density computation.
An embodiment of the present invention provides a novel processing unit for high-performance computing, comprising:
two or more minimum logical units connected in sequence, wherein each of the two sides of every two adjacent minimum logical units is connected to an internal storage controller corresponding to those two adjacent minimum logical units; each minimum logical unit comprises two interconnected computing processing units; and internal storage expanders respectively connected to the two minimum logical units located at the head and the tail;
wherein the internal storage controllers located on the same side of the minimum logical units are connected to one another, so that the computing processing units in any two of the minimum logical units are connected.
Preferably, the novel processing unit comprises 72 computing processing units.
Preferably, the novel processing unit further comprises a PCIe expander for externally providing an expansion of 36 lanes of PCI Express Gen3.0.
Preferably, the novel processing unit further comprises a device manager for providing a DMI bus to connect a PCH chipset.
An embodiment of the present invention further provides a computer system architecture for high-performance computing, comprising a core data switching unit, a management switching unit and the novel processing unit described above.
Preferably, the novel processing unit communicates externally over a PCIe x16 link, wherein the transmission bandwidth is not greater than 128 Gb/s and the bidirectional bandwidth is not greater than 256 Gb/s.
Preferably, the core data switching unit connects internally to the novel processing unit through a link with a transfer rate of 100 Gb/s, and externally provides at least one 100 Gb/s uplink communication interface.
Preferably, the management switching unit monitors the running status of each module in the computer system, wherein the running status comprises temperature and voltage.
The embodiments of the present invention provide a novel processing unit for high-performance computing and a computer system architecture. Modules other than those used for computing data, for example the I/O controller, are omitted from the novel processing unit, so that more computing processing units can be integrated in it and its data computing capability is improved. Internal storage controllers are combined with internal storage expanders, which enlarges the storage space of the novel processing unit.
Brief description of the drawings
Fig. 1 is a schematic diagram of the novel processing unit provided by an embodiment of the present invention;
Fig. 2 is a simplified diagram of the computer system architecture provided by an embodiment of the present invention;
Fig. 3 is a detailed diagram of the computer system architecture provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a novel processing unit for high-performance computing, which may comprise: two or more minimum logical units connected in sequence, wherein each of the two sides of every two adjacent minimum logical units is connected to an internal storage controller corresponding to those two adjacent minimum logical units; each minimum logical unit comprises two interconnected computing processing units; and internal storage expanders respectively connected to the two minimum logical units located at the head and the tail;
wherein the internal storage controllers located on the same side of the minimum logical units are connected to one another, so that the computing processing units in any two of the minimum logical units are connected.
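The connection rules above can be sketched as a small graph model. This is only one possible reading of the Fig. 1 layout, not the patented design itself: the node names ("cpu", "mlu", "imc", "expander"), the side labels and the function names are all hypothetical, and the figure is not reproduced here to check against. The sketch builds the links described in the text and uses a breadth-first search to confirm that every computing processing unit can reach every other.

```python
from collections import deque


def build_topology(num_mlus):
    """Return an adjacency map for one possible reading of the Fig. 1 layout."""
    if num_mlus < 2:
        raise ValueError("the text calls for two or more minimum logical units")
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for i in range(num_mlus):
        # Two interconnected computing processing units per minimum logical unit.
        link(("cpu", i, 0), ("cpu", i, 1))
        link(("mlu", i), ("cpu", i, 0))
        link(("mlu", i), ("cpu", i, 1))

    for i in range(num_mlus - 1):
        # Minimum logical units are connected in sequence.
        link(("mlu", i), ("mlu", i + 1))
        # One controller on each side of every adjacent pair (i, i + 1).
        for side in ("side_a", "side_b"):
            link(("imc", side, i), ("mlu", i))
            link(("imc", side, i), ("mlu", i + 1))
            if i > 0:
                # Controllers on the same side are connected to one another.
                link(("imc", side, i - 1), ("imc", side, i))

    # Expanders attach to the head and tail minimum logical units.
    link(("expander", "head"), ("mlu", 0))
    link(("expander", "tail"), ("mlu", num_mlus - 1))
    return adj


def reachable(adj, start):
    """Breadth-first search: every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


topology = build_topology(4)
cpus = {n for n in topology if n[0] == "cpu"}
print(cpus <= reachable(topology, ("cpu", 0, 0)))
```

A chain of four minimum logical units is used only to keep the example small; the same function accepts any count of two or more.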
In the novel processing unit for high-performance computing described above, modules other than those used for computing data, for example the I/O controller, are omitted, so that more computing processing units can be integrated in the novel processing unit and its data computing capability is improved. Internal storage controllers are combined with internal storage expanders, which enlarges the storage space of the novel processing unit.
In the present embodiment, the novel processing unit in Fig. 1 is designed for high-performance computing. In a preferred embodiment of the invention, one novel processing unit may comprise 72 computing processing units, which are responsible for data computation and instruction processing. Each computing processing unit supports up to 4 concurrent threads, so one novel processing unit supports up to 288 threads (72 × 4), which meets the demand of high-performance applications for highly concurrent data.
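The thread capacity follows directly from the two figures stated in the text (72 computing processing units, up to 4 concurrent threads each); their product is 288. A minimal sketch of that arithmetic is given below. The count of minimum logical units is derived here from the statement that each one holds two computing processing units; the text does not state that count itself, and the function name is hypothetical.

```python
def processing_unit_capacity(compute_units=72, threads_per_unit=4, units_per_mlu=2):
    """Derive thread and minimum-logical-unit counts from the stated figures."""
    if compute_units % units_per_mlu:
        raise ValueError("compute units must fill whole minimum logical units")
    return {
        "max_threads": compute_units * threads_per_unit,
        # Derived value, not stated in the text: two compute units per logical unit.
        "minimum_logical_units": compute_units // units_per_mlu,
    }


print(processing_unit_capacity())
```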
In a preferred embodiment of the invention, the novel processing unit combines internal storage controllers with internal storage expanders, and the processing unit comprises 2 internal storage expanders, as shown in Fig. 1.
The internal storage expander supports DDR4 2400 MHz memory and can address up to 384 GB. In the present embodiment, the internal storage expander serves two purposes: 1. It expands the memory space of the novel processing unit. 2. Because the die size of the novel processing unit is fixed, a processor or similar device can be attached through the internal storage expander to extend the performance of the novel processing unit.
The internal storage controller controls the storage of application data in the novel processing unit. As shown in Fig. 1, each internal storage controller is connected to one memory, which may be DRAM (Dynamic Random Access Memory). The internal storage controller controls the storing of application data in that memory.
In the present embodiment, application data that is used frequently in the novel processing unit, that is, "hot data", can be stored in the internal memory of the novel processing unit so that it can be fetched from internal memory as quickly as possible when needed. Application data that is seldom used, that is, "cold data", can be stored in the external memory provided through the internal storage expanders. As Fig. 1 shows, the internal storage expanders are connected to the minimum logical units at the head and tail of the chain of sequentially connected minimum logical units. Therefore, when a minimum logical unit other than the head or tail needs cold data, fetching that data incurs some additional delay.
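The hot/cold placement and the extra delay for interior minimum logical units can be illustrated with a rough sketch. It assumes, purely for illustration, that the delay grows with the number of chain hops between a minimum logical unit and the nearest head or tail expander, and that "hot" is decided by a hypothetical access-count threshold; the text specifies neither a latency model nor a placement policy.

```python
def hops_to_nearest_expander(index, num_mlus):
    """Chain hops from minimum logical unit `index` to the closer of head or tail."""
    if not 0 <= index < num_mlus:
        raise IndexError("minimum logical unit index out of range")
    return min(index, num_mlus - 1 - index)


def choose_tier(access_count, hot_threshold=100):
    """Place frequently used data in internal memory, the rest behind an expander.

    hot_threshold is a hypothetical tuning value, not taken from the text.
    """
    return "internal" if access_count >= hot_threshold else "expanded"


# With 36 minimum logical units, the two ends sit next to an expander,
# while the middle of the chain is the farthest away.
print(hops_to_nearest_expander(0, 36), hops_to_nearest_expander(17, 36))
print(choose_tier(500), choose_tier(3))
```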
In a preferred embodiment of the invention, one PCIe expander may also be integrated in the novel processing unit, as shown in Fig. 1. It externally provides an expansion of 36 lanes of PCI Express Gen3.0, so that PCIe-based devices can be attached externally, extending the expansion capability of the new computer system.
In a preferred embodiment of the invention, a device manager may also be integrated in the novel processing unit, as shown in Fig. 1. The device manager provides a DMI (Direct Media Interface) bus used to connect an industry-standard PCH (Platform Controller Hub, an integrated south bridge) chipset, which strengthens the compatibility of the computer system architecture with mainstream chips.
As shown in Fig. 2, an embodiment of the present invention further provides a computer system architecture for high-performance computing, which comprises a core data switching unit, a management switching unit and the novel processing unit of the above embodiment.
In the present embodiment, as shown in Fig. 2, the novel processing unit communicates externally over a PCIe x16 link, wherein the transmission bandwidth is not greater than 128 Gb/s and the bidirectional bandwidth is not greater than 256 Gb/s.
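As a rough check on the 128 Gb/s and 256 Gb/s limits, the sketch below assumes the x16 link runs at PCI Express Gen3 signalling, that is, 8 GT/s per lane with 128b/130b line encoding. The paragraph above does not name the PCIe generation, so this is an assumption, and the function name is hypothetical.

```python
GEN3_GT_PER_LANE = 8.0            # PCIe Gen3 raw signalling rate per lane, GT/s
GEN3_ENCODING = 128.0 / 130.0     # 128b/130b line encoding efficiency


def pcie_gen3_bandwidth_gbps(lanes):
    """Return (raw, after-encoding) one-direction bandwidth in Gb/s."""
    raw = GEN3_GT_PER_LANE * lanes
    return raw, raw * GEN3_ENCODING


raw, usable = pcie_gen3_bandwidth_gbps(16)
print(f"one direction: raw {raw:.0f} Gb/s, after encoding {usable:.2f} Gb/s")
print(f"both directions: raw {2 * raw:.0f} Gb/s")
```

Under this assumption the raw per-direction rate is exactly the 128 Gb/s upper bound quoted above, and the rate after encoding is slightly lower, which is consistent with the "not greater than" wording.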
In the present embodiment, the core data switching unit connects internally to the novel processing units in the computer architecture over 100 Gb/s links, enabling high-speed communication between novel processing units at a transfer rate of 100 Gb/s. It can likewise externally provide multiple 100 Gb/s uplink communication interfaces, which enables data access between the modules of the computer system and enhances the scalability of the whole high-density computing architecture.
In the present embodiment, the management unit is responsible for monitoring the running status of each module in the whole computer architecture, including temperature and voltage. For redundancy, the whole computer architecture may use 2 management modules: when one management module fails, the other takes over immediately, keeping the management network of the architecture stable. Fig. 3 shows the computer system architecture in detail.
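The two-module redundancy described above can be sketched as a simple active/standby arrangement. The class names, module names and failure-reporting method below are hypothetical; the text states only that the second management module takes over when the first fails, not how the failure is detected or how the handover is carried out.

```python
from dataclasses import dataclass


@dataclass
class ManagementModule:
    name: str
    healthy: bool = True


class RedundantManagement:
    """Active/standby pair: the standby takes over when the active module fails."""

    def __init__(self, primary, standby):
        self.modules = [primary, standby]
        self.active = primary

    def mark_failed(self, module):
        """Record a failure and, if needed, promote a healthy module."""
        module.healthy = False
        if module is self.active:
            survivors = [m for m in self.modules if m.healthy]
            if not survivors:
                raise RuntimeError("no healthy management module left")
            self.active = survivors[0]
        return self.active


module_a = ManagementModule("mgmt-a")
module_b = ManagementModule("mgmt-b")
pair = RedundantManagement(module_a, module_b)
print(pair.mark_failed(module_a).name)
```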
In a preferred embodiment of the invention, each computing unit in the computer architecture may consist of 2 sub-computing units which, according to their physical position in the system, are divided into an upper computing unit (Upper Server) and a lower computing unit (Lower Server). Each sub-computing unit consists mainly of a novel processing unit, a processing control unit, a network unit, a management unit, an internal-storage data storage unit and a control operation unit.
The novel processing unit is the core unit of the sub-computing unit. Each sub-computing unit is designed with 1 novel processing unit, whose structure is as described in the above embodiment.
The processing control unit is connected to the novel processing unit through the DMI bus; it receives instructions sent by the novel processing unit and centrally manages the I/O communication devices in the control module. The computer architecture may include 2 disks as storage units, which mainly store files or data that the novel processing unit processes infrequently or that are large in capacity. The storage units are connected to the processing control unit by SATA signals, and the processing control unit can configure the 2 storage units as RAID0 or RAID1 to meet the customer's need for high-speed data access or for redundancy.
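The trade-off between the two RAID levels for a two-disk set can be summarised in a few lines. The sketch assumes two disks of equal size; the 4000 GB disk size used in the example is hypothetical, since the text does not give disk capacities.

```python
def two_disk_raid(level, disk_gb):
    """Usable capacity and fault tolerance for two equal-sized disks."""
    if level == 0:
        # Striping: data is split across both disks for speed, with no redundancy.
        return {"usable_gb": 2 * disk_gb, "survives_one_disk_failure": False}
    if level == 1:
        # Mirroring: each disk holds a full copy, so half the raw space is usable.
        return {"usable_gb": disk_gb, "survives_one_disk_failure": True}
    raise ValueError("only RAID0 and RAID1 are described for the two-disk set")


print(two_disk_raid(0, 4000))
print(two_disk_raid(1, 4000))
```

RAID0 matches the "high-speed data access" need named in the text and RAID1 matches the "redundancy" need.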
The management unit may be connected to the processing control unit through a PCIe 2.0 x1 link and may use an ASP2400-series chipset; it is responsible for monitoring the temperature and voltage of all devices in the computer system. Each management unit routes two 1 Gbit/s SGMII signals through a transmission unit to the management switching unit. The management switching unit collects the management information of the upper computing unit (Upper Server) and the lower computing unit (Lower Server) of one computing unit and is then connected through I/O unit 2 to the management switching network of the computer system architecture. Through the management switching network, the user can obtain or monitor the running status of all computing units in the whole computer architecture and can also assign computing tasks to each computing unit.
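The temperature and voltage monitoring described above amounts to comparing sensor readings against allowed ranges. The sketch below shows that comparison only; the sensor names and limit values are hypothetical, and it does not use or imitate any real management-controller interface.

```python
def out_of_range(readings, limits):
    """Return the sensor names whose value falls outside its (low, high) limits."""
    alerts = []
    for name, value in readings.items():
        low, high = limits[name]
        if not low <= value <= high:
            alerts.append(name)
    return sorted(alerts)


# Hypothetical limits: a temperature ceiling in Celsius and a 12 V rail tolerance.
LIMITS = {"cpu_temp_c": (0.0, 85.0), "rail_12v": (11.4, 12.6)}

print(out_of_range({"cpu_temp_c": 91.0, "rail_12v": 12.1}, LIMITS))
```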
The application extension unit uses a data link design based on PCIe Gen3.0 x16 and can provide a maximum transmission bandwidth of 100 Gb/s. It is connected directly to the novel processing unit, so actual application data can be delivered point-to-point straight into the novel processing unit. This avoids problems such as data loss and transmission delay that arise in multi-point data transmission, and better suits the high-bandwidth, low-latency data transfer that high-performance applications require. The application extension unit also follows a modular design concept and can support mainstream transport applications such as Ethernet, InfiniBand, FC and iSCSI. Different modules let the whole new computer architecture support different applications, which further improves the flexibility with which the new system architecture can be extended. One application extension unit is provided in each novel high-density computing unit. It is connected to the novel processing units in the upper computing unit (Upper Server) and the lower computing unit (Lower Server) of that computing unit, and is connected through I/O unit 1 to the core data switching network of the computer system architecture. The core data unit mainly carries the computing tasks of the whole architecture.
In summary, the embodiments of the present invention can achieve at least the following beneficial effects:
1. Modules other than those used for computing data, for example the I/O controller, are omitted from the novel processing unit, so that more computing processing units can be integrated in it and its data computing capability is improved. Internal storage controllers are combined with internal storage expanders, which enlarges the storage space of the novel processing unit.
2. A PCIe expander is integrated in the novel processing unit, so that PCIe-based devices can be attached externally, extending the expansion capability of the new computer system.
3. A device manager is integrated in the novel processing unit, which strengthens the compatibility of the computer system architecture with mainstream chips.
For the information exchange and execution processes among the units of the above device, since they are based on the same concept as the method embodiments of the present invention, refer to the description in the method embodiments; details are not repeated here.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article or device that comprises the element.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above are only preferred embodiments of the present invention, intended solely to illustrate its technical solutions and not to limit its protection scope. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims (8)

1. A novel processing unit for high-performance computing, characterized by comprising:
two or more minimum logical units connected in sequence, wherein each of the two sides of every two adjacent minimum logical units is connected to an internal storage controller corresponding to those two adjacent minimum logical units; each minimum logical unit comprises two interconnected computing processing units; and internal storage expanders respectively connected to the two minimum logical units located at the head and the tail;
wherein the internal storage controllers located on the same side of the minimum logical units are connected to one another, so that the computing processing units in any two of the minimum logical units are connected.
2. The novel processing unit according to claim 1, characterized in that the novel processing unit comprises 72 computing processing units.
3. The novel processing unit according to claim 1, characterized by further comprising: a PCIe expander for externally providing an expansion of 36 lanes of PCI Express Gen3.0.
4. The novel processing unit according to claim 1, characterized by further comprising: a device manager for providing a DMI bus to connect a PCH chipset.
5. A computer system architecture for high-performance computing, characterized by comprising a core data switching unit, a management switching unit and the novel processing unit according to any one of claims 1 to 4.
6. The novel processing unit according to claim 5, characterized in that the novel processing unit communicates externally over a PCIe x16 link, wherein the transmission bandwidth is not greater than 128 Gb/s and the bidirectional bandwidth is not greater than 256 Gb/s.
7. The computer system architecture according to claim 5, characterized in that the core data switching unit connects internally to the novel processing unit through a link with a transfer rate of 100 Gb/s, and externally provides at least one 100 Gb/s uplink communication interface.
8. The computer system architecture according to claim 5, characterized in that the management switching unit monitors the running status of each module in the computer system, wherein the running status comprises temperature and voltage.
CN201510375045.1A 2015-07-01 2015-07-01 High-performance-computation-oriented novel processing unit and computer system architecture Pending CN105005361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510375045.1A CN105005361A (en) 2015-07-01 2015-07-01 High-performance-computation-oriented novel processing unit and computer system architecture

Publications (1)

Publication Number Publication Date
CN105005361A true CN105005361A (en) 2015-10-28

Family

ID=54378062

Country Status (1)

Country Link
CN (1) CN105005361A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020032877A1 (en) * 2000-08-31 2002-03-14 Tsutomu Iwaki Graphics controller and power management method for use in the same
CN1848065A (en) * 2006-05-12 2006-10-18 华中科技大学 Isomeric double-system bus objective storage controller
CN102929363A (en) * 2012-10-25 2013-02-13 浪潮电子信息产业股份有限公司 Design method of high-density blade server
CN103970214A (en) * 2014-05-19 2014-08-06 浪潮电子信息产业股份有限公司 Heterogeneous acceleration blade type computer system architecture
CN104035525A (en) * 2014-06-27 2014-09-10 浪潮(北京)电子信息产业有限公司 Computational node
CN104408014A (en) * 2014-12-23 2015-03-11 浪潮电子信息产业股份有限公司 System and method for interconnecting processing units of calculation systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20151028