CN102902655A - Information processing heterogeneous system - Google Patents
- Publication number
- CN102902655A
- Authority
- CN
- China
- Prior art keywords
- cpu
- mic
- information processing
- chip
- performance
- Prior art date
- Legal status
- Pending
Abstract
The invention relates to a heterogeneous information processing system. The system comprises a central processing unit (CPU) platform, at least one Many Integrated Core (MIC) chip, and a connector, wherein the CPU platform comprises a CPU chip and the connector connects the MIC chip to the CPU platform. The system can effectively improve system performance and meet the requirements of high-performance applications.
Description
Technical field
The present invention relates to the field of high-performance computing, and in particular to a heterogeneous information processing system.
Background art
High-performance computing is a frontier technology of the information field. It directly supports national security, progress in defense science and technology, and the development of advanced weapons, and it is one of the important indicators of a nation's overall strength. With the rapid development of the information society, the demands placed on information processing capability keep rising. High-performance computing is needed not only in petroleum exploration, weather forecasting, aerospace and defense, and scientific research; demand is also growing rapidly in broader fields such as finance, e-government, education, enterprise, and online gaming.
Computing speed is particularly important for high-performance computing. High-performance computing is moving toward multi-core and many-core designs and uses heterogeneous parallelism to raise computing speed. CPU+GPU is currently a very mature heterogeneous collaborative computing model, but GPUs pose great challenges in programming efficiency, fine-grained parallel algorithms, and large-scale parallel performance.
Summary of the invention
The technical problem to be solved by the present invention is to provide a heterogeneous information processing system that addresses the low system performance encountered in high-performance computing applications.
To solve the above technical problem, the present invention provides a heterogeneous information processing system comprising:
A central processing unit (CPU) platform, the platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting the MIC chip to the CPU platform.
The present invention also provides another heterogeneous information processing system comprising:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
Second and third execution units, each connected to the first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with the first execution unit.
The heterogeneous information processing system of the present invention is composed of CPU chips and MIC chips, preferably the currently popular dual-socket CPU configuration together with at least two MIC chips. It can effectively improve system performance and meet the requirements of high-performance applications.
Description of drawings
Fig. 1 is a schematic diagram of the module structure of Embodiment 1 of the heterogeneous information processing system of the present invention;
Fig. 2 is a schematic diagram of the module structure of Embodiment 2 of the heterogeneous information processing system of the present invention;
Fig. 3 shows the result of running PSTM serially;
Fig. 4 shows the result of running PSTM on the system of the present invention.
Detailed description of the embodiments
Embodiment 1
The present invention is a heterogeneous information processing system based on CPUs and Intel MIC (Intel Many Integrated Core). As shown in Fig. 1, the system comprises:
A central processing unit (CPU) platform, the platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting the MIC chip to the CPU platform.
Specifically, the connector is a PCIe slot.
MIC is a many-core chip developed by Intel for high-performance parallel computing. It evolved from the existing Xeon processor line and is a new architecture designed specifically for very-high-performance computing. The official product based on the MIC architecture is Xeon Phi. It is not intended to replace the CPU in the computer architecture; it serves as a coprocessor. A MIC chip typically has more than 50 simplified x86 cores, each supporting 4 hardware threads, so more than 200 tasks can execute in parallel. This provides highly parallel computing capability, with double-precision peak performance reaching 1 TFlops. MIC technology will accelerate the development of high-performance computing and quickly resolve the performance bottlenecks of high-performance computing applications.
The system targets high-performance computing applications and adopts a CPU/MIC heterogeneous architecture. It combines the multi-core computing capability of the CPU platform with the many-core computing capability of the MIC and makes full use of both kinds of chip by having them take part in the computation together. The computing capability of the system is thereby greatly increased and the performance bottleneck of high-performance computing applications is resolved, so the system is a high-performance system. It is also a low-energy system: its performance per watt is far higher than that of a homogeneous CPU platform, so the whole system saves energy while achieving high performance. Overall, the system is a high-efficiency system.
The system is configured with 96 GB or more of memory and supports a peak power of 1300 W or more.
The operating system, compiler, and driver of the CPU platform all support MIC.
The operating system is Linux, and the compilers are Intel's icc, icpc, and ifort.
Preferably, the system comprises 2 CPU chips and 2 MIC chips; each CPU chip has 6 cores, and each MIC chip has 50 or more cores.
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the drawings and embodiments.
The heterogeneous information processing system of the present invention achieves high performance and low power consumption based on the CPU/MIC heterogeneous architecture. It is described below in two parts: hardware and system environment configuration.
Hardware components:
The CPU platform is dual-socket and supports 2 CPUs working simultaneously. This implementation uses 2 Intel Xeon 5675 six-core CPUs with a clock frequency of 3.07 GHz.
The system has two or more PCIe slots so that 2 MIC chips can be installed. This system uses 2 MIC chips, each card having 50 or more cores.
The system memory must be large, at least twice that of the original CPU platform. This system is configured with 96 GB or more of memory.
The power supply must support 1300 W or more to ensure that the whole system runs normally; this system supports a peak power of 1300 W.
System environment configuration:
The operating system must support MIC, which requires installing a Linux operating system. This implementation uses Red Hat Enterprise Linux 6.0 GA 64-bit, kernel 2.6.32-71;
The compiler must support MIC; Intel's icc, icpc, and ifort compilers can be used. This embodiment uses Intel compiler l ccompxe 2013beta.0.047;
A driver that supports MIC is required; KNC-AlphaUpdate1-2.1.2430-9 is used.
Embodiment 2
For this system to be efficient, co-design is necessary so that application software runs on it with the highest efficiency.
Accordingly, the heterogeneous information processing system provided by the present invention can also be described from the following perspective. As shown in Fig. 2, the system comprises:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
Second and third execution units, each connected to the first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with the first execution unit;
Specifically, the first, second, and third execution units perform information processing in a multithreaded manner and on the principle of load balancing.
The first execution unit starts 12 threads to perform information processing, and the second and third execution units each start 200 or more threads.
Preferably, each CPU chip comprises at least 6 cores, with each core running one thread, and each MIC chip comprises at least 50 cores, with each core able to run 4 threads.
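The thread counts quoted above follow from simple multiplication. The short Python sketch below reproduces that arithmetic. The constant and function names are illustrative and do not come from the patent, and the MIC core count is taken at its stated minimum of 50.

```python
# Illustrative constants taken from the figures stated in the text.
CPU_CHIPS = 2              # dual-socket CPU platform (first execution unit)
CORES_PER_CPU = 6          # at least 6 cores per CPU chip
THREADS_PER_CPU_CORE = 1   # one thread per CPU core
MIC_CHIPS = 2              # second and third execution units
CORES_PER_MIC = 50         # stated minimum; real cards may have more
THREADS_PER_MIC_CORE = 4   # up to 4 hardware threads per MIC core


def thread_budget():
    """Return the minimum thread counts per execution unit and in total."""
    cpu_unit = CPU_CHIPS * CORES_PER_CPU * THREADS_PER_CPU_CORE
    per_mic_unit = CORES_PER_MIC * THREADS_PER_MIC_CORE
    total = cpu_unit + MIC_CHIPS * per_mic_unit
    return {"cpu_unit": cpu_unit, "per_mic_unit": per_mic_unit, "total": total}


if __name__ == "__main__":
    print(thread_budget())
```

With these minimum values the CPU unit runs 12 threads, each MIC unit runs 200, and the whole system runs 412.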
Mainstream servers are currently dual-socket, that is, they hold 2 CPUs. Pairing each CPU with one MIC gives the highest PCIe efficiency and the best data transfer performance between CPU and MIC.
To test the performance of the system, a high-performance computing application can be chosen whose algorithm is highly task-parallel, whose parallel tasks have no data dependences and parallelize well, and which as a whole places high demands on system performance. Seismic prestack time migration (PreStack Time Migration, PSTM) has exactly these characteristics. Taking this application as an example, the process of improving an existing program that runs single-threaded on a CPU platform is described below:
The original PSTM program runs single-threaded on the CPU platform. First, the CPU multi-core platform is exploited: the program is made multithreaded with the OpenMP programming model, and all computing tasks are run in parallel on 12 threads so that every core of the 2 CPUs contributes its computing capability;
Then, building on the multithreaded CPU version of PSTM, the threading is extended to the MIC chip: all computing tasks are run in parallel on 200 or more threads on the MIC, bringing the many-core computing capability of the MIC into play;
The computing capability of the whole system is divided into 3 devices: the first MIC chip, as device 0, starts 200 or more threads; the second MIC chip, as device 1, starts 200 or more threads; and the 2 CPUs, as device 2, start 12 threads, as shown in Fig. 2;
The computing tasks of the whole PSTM are divided according to the computing capability of these three devices so that all three compute in parallel at the same time. That is, the 412 or more threads take part in the computation together, so that CPU and MIC compute simultaneously, load balance is ensured, and the whole system achieves high performance.
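The patent does not say how the split across the three devices is computed. As one possible illustration, the Python sketch below divides a set of independent tasks into contiguous index ranges proportional to per-device weights. The function name, the task count, and the weights are hypothetical; in practice the weights would have to be measured on the actual hardware.

```python
def partition_by_capacity(n_tasks, capacities):
    """Split task indices 0..n_tasks-1 into contiguous half-open ranges.

    capacities holds one positive integer weight per device (hypothetical
    values, not figures from the patent). Each device receives a share of
    the tasks proportional to its weight, and every task is assigned to
    exactly one device.
    """
    if n_tasks < 0 or not capacities or any(c <= 0 for c in capacities):
        raise ValueError("need n_tasks >= 0 and positive integer capacities")
    total = sum(capacities)
    ranges = []
    start = 0
    accumulated = 0
    for weight in capacities:
        accumulated += weight
        # Integer arithmetic keeps the final boundary exactly at n_tasks.
        end = (n_tasks * accumulated) // total
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # Device 0 and device 1 stand for the two MIC cards, device 2 for the
    # pair of CPUs. The 5:5:2 weighting is an assumption for illustration.
    for device, (lo, hi) in enumerate(partition_by_capacity(1200, [5, 5, 2])):
        print(f"device {device}: tasks [{lo}, {hi}) -> {hi - lo} tasks")
```

Contiguous ranges are used here only because they are simple to reason about. A dynamic scheme in which devices pull work from a shared queue is another way to balance load when device speeds are not known in advance.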
Specifically, consider a migration imaging test with 91 survey lines, 963 common midpoint (CMP) points per survey line, and 110000 input traces. On the original homogeneous CPU system, PSTM takes 76053 s in single-threaded serial mode, whereas on this system the running time is 1075 s, which is roughly 70 times faster. The imaging result of the serial CPU version of PSTM is shown in Fig. 3, and the imaging result of this system is shown in Fig. 4, where the horizontal axis is the common midpoint along a survey line and the vertical axis is time. The two images are essentially identical, which indicates that the computed result is correct.
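As a quick check of the improvement implied by these timings, the following sketch computes the speedup ratio from the two running times reported above. The function name is illustrative.

```python
SERIAL_SECONDS = 76053   # single-threaded PSTM on the original CPU system
HETERO_SECONDS = 1075    # PSTM on the CPU/MIC heterogeneous system


def speedup(baseline_seconds, improved_seconds):
    """Return how many times faster the improved run is than the baseline."""
    if improved_seconds <= 0:
        raise ValueError("improved_seconds must be positive")
    return baseline_seconds / improved_seconds


if __name__ == "__main__":
    print(f"speedup: {speedup(SERIAL_SECONDS, HETERO_SECONDS):.1f}x")
```

With the reported figures the ratio is roughly 70.7. Note that the baseline is a single-threaded run, so part of this gain comes from using all 12 CPU threads and not only from the MIC cards.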
The system of the present invention has the characteristics of high performance and low power consumption. It resolves the performance bottleneck and power consumption problems of high-performance applications, meets the needs of production and scientific research, and reduces machine-room construction costs as well as management, operation, and maintenance costs. In the present invention, the CPU takes part not only in logic computation but also in the compute-intensive core computation, whereas the MIC takes part only in the compute-intensive core computation. CPU and MIC compute together to maximize performance.
As the seismic prestack time migration embodiment shows, the whole system achieves high performance and low power consumption and goes a long way toward meeting the scientific research and industrial production demands of high-performance applications. The system also reduces machine-room construction costs and management, operation, and maintenance costs.
Claims (10)
1. A heterogeneous information processing system, characterized in that the system comprises:
A central processing unit (CPU) platform, the platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting the MIC chip to the CPU platform.
2. The system as claimed in claim 1, characterized in that the connector is a PCIe slot.
3. The system as claimed in claim 1, characterized in that the system is configured with 96 GB or more of memory and supports a peak power of 1300 W or more.
4. The system as claimed in claim 1, characterized in that the operating system, compiler, and driver of the CPU platform all support MIC.
5. The system as claimed in claim 1, characterized in that the operating system is Linux and the compilers are Intel's icc, icpc, and ifort.
6. The system as claimed in claim 1, characterized in that the system comprises 2 CPU chips and 2 MIC chips, each CPU chip comprising 6 cores and each MIC chip comprising at least 50 cores.
7. A heterogeneous information processing system, characterized in that the system comprises:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
Second and third execution units, each connected to the first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with the first execution unit.
8. The system as claimed in claim 7, characterized in that the first, second, and third execution units perform information processing in a multithreaded manner.
9. The system as claimed in claim 7, characterized in that the first, second, and third execution units perform information processing on the principle of load balancing.
10. The system as claimed in claim 7, characterized in that the first execution unit starts 12 threads to perform information processing, and the second and third execution units each start at least 200 threads to perform information processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012103392335A CN102902655A (en) | 2012-09-13 | 2012-09-13 | Information processing heterogeneous system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102902655A (en) | 2013-01-30 |
Family
ID=47574895
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012103392335A (Pending) CN102902655A (en) | 2012-09-13 | 2012-09-13 | Information processing heterogeneous system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102902655A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103049329A (en) * | 2012-11-22 | 2013-04-17 | 浪潮电子信息产业股份有限公司 | High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure |
CN103744898A (en) * | 2013-12-25 | 2014-04-23 | 浪潮电子信息产业股份有限公司 | Dependent library automatic processing method of MIC (Many Integrated Core) native mode program |
CN105893151A (en) * | 2016-04-01 | 2016-08-24 | 浪潮电子信息产业股份有限公司 | High-dimensional data flow processing method based on CPU-MIC heterogeneous platform |
CN110287139A (en) * | 2019-06-13 | 2019-09-27 | 深圳大学 | CPU and MIC cooperated computing method, apparatus and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101526934A (en) * | 2009-04-21 | 2009-09-09 | 浪潮电子信息产业股份有限公司 | Construction method of GPU and CPU combined processor |
US20120192198A1 (en) * | 2011-01-24 | 2012-07-26 | Nec Laboratories America, Inc. | Method and System for Memory Aware Runtime to Support Multitenancy in Heterogeneous Clusters |
Non-Patent Citations (3)
Title |
---|
ALEJANDRO DURAN et al.: "The Intel® Many Integrated Core Architecture", International Conference on High Performance Computing and Simulation, 6 July 2012 (2012-07-06), pages 365 - 366 * |
ERIK SAULE et al.: "An Early Evaluation of the Scalability of Graph Algorithms on the Intel MIC Architecture", International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 25 May 2012 (2012-05-25), pages 1629 - 1630 * |
张清 (Zhang Qing): "浪潮异构HPC三大方案" (Inspur's three major heterogeneous HPC solutions), 《科技浪潮》, 15 October 2011 (2011-10-15) * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103049329A (en) * | 2012-11-22 | 2013-04-17 | 浪潮电子信息产业股份有限公司 | High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure |
CN103744898A (en) * | 2013-12-25 | 2014-04-23 | 浪潮电子信息产业股份有限公司 | Dependent library automatic processing method of MIC (Many Integrated Core) native mode program |
CN105893151A (en) * | 2016-04-01 | 2016-08-24 | 浪潮电子信息产业股份有限公司 | High-dimensional data flow processing method based on CPU-MIC heterogeneous platform |
CN105893151B (en) * | 2016-04-01 | 2019-03-08 | 浪潮电子信息产业股份有限公司 | A kind of processing method of the High Dimensional Data Streams based on CPU+MIC heterogeneous platform |
CN110287139A (en) * | 2019-06-13 | 2019-09-27 | 深圳大学 | CPU and MIC cooperated computing method, apparatus and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101901042B (en) | Method for reducing power consumption based on dynamic task migrating technology in multi-GPU (Graphic Processing Unit) system | |
Liao et al. | MilkyWay-2 supercomputer: system and application | |
Reaño et al. | Local and remote GPUs perform similar with EDR 100G InfiniBand | |
Gilani et al. | Power-efficient computing for compute-intensive GPGPU applications | |
CN103279446A (en) | Isomerism mixed calculation multi-platform system using central processing unit (CPU)+graphic processing unit (GPU)+many integrated core (MIC) | |
Gao et al. | A survey of homogeneous and heterogeneous system architectures in high performance computing | |
CN102902655A (en) | Information processing heterogeneous system | |
CN102253919A (en) | Concurrent numerical simulation method and system based on GPU and CPU cooperative computing | |
CN103049329A (en) | High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure | |
CN103294639A (en) | CPU+MIC mixed heterogeneous cluster system for achieving large-scale computing | |
Cesini et al. | Power-efficient computing: experiences from the COSA project | |
Wang et al. | Task scheduling of parallel processing in CPU-GPU collaborative environment | |
Zhang et al. | Comparison and analysis of GPGPU and parallel computing on multi-core CPU | |
He et al. | Haas: Cloud-based real-time data analytics with heterogeneity-aware scheduling | |
Chen et al. | Integrated research of parallel computing: Status and future | |
Childs et al. | Particle advection performance over varied architectures and workloads | |
Zhou et al. | Parallel data cube computation on graphic processing units | |
CN203465722U (en) | Computer system facing multi-scale calculation | |
Rojek et al. | Parallelization of EULAG model on multicore architectures with GPU accelerators | |
CN102866423A (en) | Seismic prestack time migration processing method and system | |
Gong et al. | Optimizing Sweep3D for graphic processor unit | |
Jararweh et al. | Gpu-based personal supercomputing | |
US20150106589A1 (en) | Small form high performance computing mini hpc | |
Wang et al. | Data motion acceleration: Chaining cross-domain multi accelerators | |
CN103019323A (en) | Binary-star computer server based on Loongson processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20130130 |